OPTIMIZING GLYCAN PROCESSING IN PLANTS

FIELD OF THE INVENTION 

The invention is directed to methods for optimizing glycan processing in a cell or an organism
containing glycoproteins with N-glycans, in particular plants, so that a glycoprotein can be obtained
that has an N-glycan (high mannose type, hybrid or, preferably, complex type, including but not
limited to bi-antennary N-glycans) which contains a galactose residue on at least one arm of the
N-glycan and is devoid of (or reduced in) xylose and fucose residues. The
invention is further directed to said glycoprotein obtained and in particular a plant host system
comprising said protein.

BACKGROUND OF THE INVENTION 

N-linked glycans, specific oligosaccharide structures attached to asparagine residues of 
glycoproteins, can contribute significantly to the properties of the protein and, in turn, to the
properties of the organism. Plant proteins can carry N-linked glycans but, in marked contrast to
mammals, only a few biological processes are known to which they contribute.

Biogenesis of N-linked glycans begins with the synthesis of a lipid-linked oligosaccharide
moiety (Glc3Man9GlcNAc2-) which is transferred en bloc to the nascent polypeptide chain in the
endoplasmic reticulum (ER). Through a series of trimming reactions by exoglycosidases in the ER
and cis-Golgi compartments, the so-called "high mannose" (Man9GlcNAc2 to Man5GlcNAc2)
glycans are formed. Subsequently, the formation of complex type glycans starts with the transfer of
the first GlcNAc onto Man5GlcNAc2 by GnTI and further trimming by mannosidase II (ManII) to
form GlcNAcMan3GlcNAc2. Complex glycan biosynthesis continues while the glycoprotein is
progressing through the secretory pathway with the transfer in the Golgi apparatus of the second
GlcNAc residue by GnTII as well as other monosaccharide residues onto the GlcNAcMan3GlcNAc2
under the action of several other glycosyl transferases.
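
The trimming and transfer steps described above can be followed as simple composition bookkeeping. The sketch below is illustrative only: the Counter representation and the helper names trim and transfer are assumptions introduced here, not part of the specification, and the model tracks residue counts only, ignoring linkages and branching.

```python
from collections import Counter


def trim(glycan, residue, n):
    """Remove n copies of `residue`, as a glucosidase or mannosidase would."""
    if glycan[residue] < n:
        raise ValueError(f"cannot remove {n} x {residue} from {dict(glycan)}")
    out = Counter(glycan)
    out[residue] -= n
    return +out  # unary plus drops residues whose count fell to zero


def transfer(glycan, residue, n=1):
    """Add n copies of `residue`, as a glycosyltransferase would."""
    out = Counter(glycan)
    out[residue] += n
    return out


# Lipid-linked precursor transferred en bloc in the ER.
precursor = Counter(Glc=3, Man=9, GlcNAc=2)       # Glc3Man9GlcNAc2
man9 = trim(precursor, "Glc", 3)                  # Man9GlcNAc2
man5 = trim(man9, "Man", 4)                       # Man5GlcNAc2 (high mannose)
gn_man5 = transfer(man5, "GlcNAc")                # GnTI: GlcNAcMan5GlcNAc2
gn_man3 = trim(gn_man5, "Man", 2)                 # ManII: GlcNAcMan3GlcNAc2
gn2_man3 = transfer(gn_man3, "GlcNAc")            # GnTII: second GlcNAc

for name, g in [("Man5", man5), ("after ManII", gn_man3), ("after GnTII", gn2_man3)]:
    print(name, dict(g))
```

Counting residues this way makes it easy to check that each named intermediate in the text is reachable from the previous one by a single trimming or transfer step.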

Plants and mammals differ with respect to the formation of complex glycans (see Figure 1,
which compares the glycosylation pathway of glycoproteins in plants and mammals). In plants,
complex glycans are characterized by the presence of β(1,2)-xylose residues linked to the Man-3
and/or an α(1,3)-fucose residue linked to GlcNAc-1, instead of an α(1,6)-fucose residue linked to the
GlcNAc-1. Genes encoding the corresponding xylosyl (XylT) and fucosyl (FucT) transferases have
been isolated [Strasser et al., "Molecular cloning and functional expression of beta1,2-
xylosyltransferase cDNA from Arabidopsis thaliana," FEBS Lett. 472:105 (2000); Leiter et al.,
"Purification, cDNA cloning, and expression of GDP-L-Fuc:Asn-linked GlcNAc alpha1,3-
fucosyltransferase from mung beans," J. Biol. Chem. 274:21830 (1999)]. Plants do not possess
β(1,4)-galactosyltransferases nor α(2,6)-sialyltransferases and consequently plant glycans lack the
β(1,4)-galactose and terminal α(2,6)NeuAc residues often found on mammalian glycans.

The final glycan structures are not only determined by the mere presence of enzymes
involved in their biosynthesis and transport but to a large extent by the specific sequence of the 
various enzymatic reactions. The latter is controlled by discrete sequestering and relative position of 
these enzymes throughout the ER and Golgi, which is mediated by the interaction of determinants of 
the transferase and specific characteristics of the sub-Golgi compartment for which the transferase is 
destined. A number of studies using hybrid molecules have identified that the transmembrane
domains of several glycosyltransferases, including that of β(1,4)-galactosyltransferases, play a
central role in their sub-Golgi sorting [Grabenhorst et al., J. Biol. Chem. 274:36107 (1999); Colley,
K., Glycobiology 7:1 (1997); Munro, S., Trends Cell Biol. 8:11 (1998); Gleeson, P.A., Histochem.
Cell Biol. 109:517 (1998)].

Although plants and mammals diverged a relatively long time ago, N-linked
glycosylation seems at least partly conserved. This is evidenced by the similar though not identical
glycan structures and by the observation that a mammalian GlcNAcTI gene complements an
Arabidopsis mutant that is deficient in GlcNAcTI activity, and vice versa. The differences in glycan
structures can have important consequences. For example, xylose and α(1,3)-fucose epitopes are
known to be highly immunogenic and possibly allergenic in some circumstances, which may pose a
problem when plants are used for the production of therapeutic glycoproteins. Moreover, blood
serum of many allergy patients contains IgE directed against these epitopes, and in addition 50% of
non-allergic blood donors have antibodies specific for core-xylose in their sera, whereas 25% have
antibodies for core-alpha1,3-fucose (Bardor et al., 2002, in press, Glycobiology) (Advance Access
published December 17, 2002), which puts these individuals at risk when treated with recombinant
proteins produced in plants containing fucose and/or xylose. In addition, this carbohydrate-directed
IgE in sera might cause false positive reactions in in vitro tests using plant extracts, since there is
evidence that these carbohydrate-specific IgEs are not relevant for the allergenic reaction. In sum, a
therapeutic failure with a glycoprotein produced in plants might be the result of accelerated
clearance of the recombinant glycoprotein having xylose and/or fucose.

Accordingly, there is a need to better control glycosylation in plants, and particularly, 
glycosylation of glycoproteins intended for therapeutic use. 

DEFINITIONS 

To facilitate understanding of the invention, a number of terms as used in this specification 
are defined below. 

The term "vector" refers to any genetic element, such as a plasmid, phage, transposon, 
cosmid, chromosome, retrovirus, virion, or similar genetic element, which is capable of replication 
when associated with the proper control elements and which can transfer gene sequences into cells 
and/or between cells. Thus, this term includes cloning and expression vehicles, as well as viral 
vectors. 

The term "expression vector" as used herein refers to a recombinant DNA molecule
containing a desired coding sequence (or coding sequences) - such as the coding sequence(s) for the 
hybrid enzyme(s) described in more detail below - and appropriate nucleic acid sequences necessary 
for the expression of the operably linked coding sequence in a particular host cell or organism. 
Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an
operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells 
are known to utilize promoters, enhancers, and termination and polyadenylation signals. It is not 
intended that the present invention be limited to particular expression vectors or expression vectors 
with particular elements. 

The term "transgenic" when used in reference to a cell refers to a cell which contains a
transgene, or whose genome has been altered by the introduction of a transgene. The term
"transgenic" when used in reference to a cell, tissue or to a plant refers to a cell, tissue or plant,
respectively, which comprises a transgene, where one or more cells of the tissue contain a transgene
(such as a gene encoding the hybrid enzyme(s) of the present invention), or a plant whose genome
has been altered by the introduction of a transgene. Transgenic cells, tissues and plants may be
produced by several methods including the introduction of a "transgene" comprising nucleic acid 
(usually DNA) into a target cell or integration of the transgene into a chromosome of a target cell by 
way of human intervention, such as by the methods described herein. 

The term "transgene" as used herein refers to any nucleic acid sequence which is introduced
into the genome of a cell by experimental manipulations. A transgene may be an "endogenous DNA
sequence," or a "heterologous DNA sequence" (i.e., "foreign DNA"). The term "endogenous DNA 
sequence" refers to a nucleotide sequence which is naturally found in the cell into which it is 
introduced so long as it does not contain some modification (e.g., a point mutation, the presence of a 
selectable marker gene, or other like modifications) relative to the naturally-occurring sequence.
The term "heterologous DNA sequence" refers to a nucleotide sequence which is ligated to, or is
manipulated to become ligated to, a nucleic acid sequence to which it is not ligated in nature, or to 
which it is ligated at a different location in nature. Heterologous DNA is not endogenous to the cell 
into which it is introduced, but has been obtained from another cell. Heterologous DNA also 
includes an endogenous DNA sequence which contains some modification. Generally, although not
necessarily, heterologous DNA encodes RNA and proteins that are not normally produced by the cell
in which it is expressed. Examples of heterologous DNA include reporter genes, transcriptional
and translational regulatory sequences, sequences encoding selectable marker proteins (e.g.,
proteins which confer drug resistance), or other similar elements.

The term "foreign gene" refers to any nucleic acid (e.g., gene sequence) which is introduced
into the genome of a cell by experimental manipulations and may include gene sequences found in
that cell so long as the introduced gene contains some modification (e.g., a point mutation, the 
presence of a selectable marker gene, or other like modifications) relative to the naturally-occurring 
gene. 

The term "fusion protein" refers to a protein wherein at least one part or portion is from a
first protein and another part or portion is from a second protein. The term "hybrid enzyme" refers to 
a fusion protein which is a functional enzyme, wherein at least one part or portion is from a first 
species and another part or portion is from a second species. Preferred hybrid enzymes of the 
present invention are functional glycosyltransferases (or portions thereof) wherein at least one part or
portion is from a plant and another part or portion is from a mammal (such as human). 

The term "introduction into a cell" or "introduction into a host cell" in the context of nucleic
acid (e.g., vectors) is intended to include what the art calls "transformation" or "transfection" or
"transduction." Transformation of a cell may be stable or transient - and the present invention
contemplates introduction of vectors under conditions where, on the one hand, there is stable
expression, and on the other hand, where there is only transient expression. The term "transient 
transformation" or "transiently transformed" refers to the introduction of one or more transgenes into 
a cell in the absence of integration of the transgene into the host cell's genome. Transient 
transformation may be detected by, for example, enzyme-linked immunosorbent assay (ELISA) 
which detects the presence of a polypeptide encoded by one or more of the transgenes. Alternatively,
transient transformation may be detected by detecting the activity of the protein (e.g., antigen 
binding of an antibody) encoded by the transgene (e.g., the antibody gene). The term "transient 
transformant" refers to a cell which has transiently incorporated one or more transgenes. In contrast, 
the term "stable transformation" or "stably transformed" refers to the introduction and integration of 
one or more transgenes into the genome of a cell. Stable transformation of a cell may be detected by
Southern blot hybridization of genomic DNA of the cell with nucleic acid sequences which are 
capable of binding to one or more of the transgenes. Alternatively, stable transformation of a cell 
may also be detected by the polymerase chain reaction (PCR) of genomic DNA of the cell to amplify 
transgene sequences. The term "stable transformant" refers to a cell which has stably integrated one 
or more transgenes into the genomic DNA. Thus, a stable transformant is distinguished from a
transient transformant in that, whereas genomic DNA from the stable transformant contains one or
more transgenes, genomic DNA from the transient transformant does not contain a transgene. 

The term "host cell" includes both mammalian (e.g. human B cell clones, Chinese hamster 
ovary cells, hepatocytes) and non-mammalian cells (e.g. insect cells, bacterial cells, plant cells). In 
one embodiment, the host cells are mammalian cells and the introduction of a vector expressing a
hybrid protein of the present invention (e.g., TmGnTII-GalT) inhibits (or at least reduces)
fucosylation in said mammalian cells. 

The term "nucleotide sequence of interest" refers to any nucleotide sequence, the 
manipulation of which may be deemed desirable for any reason (e.g., confer improved qualities, use 
for production of therapeutic proteins), by one of ordinary skill in the art. Such nucleotide sequences
include, but are not limited to, coding sequences of structural genes (e.g., reporter genes, selection 
marker genes, oncogenes, antibody genes, drug resistance genes, growth factors, and other like 
genes), and non-coding regulatory sequences which do not encode an mRNA or protein product
(e.g., promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, and
other like sequences). The present invention contemplates host cells expressing a heterologous 
protein encoded by a nucleotide sequence of interest along with one or more hybrid enzymes. 

The term "isolated" when used in relation to a nucleic acid, as in "an isolated nucleic acid
sequence" refers to a nucleic acid sequence that is identified and separated from one or more other
components (e.g., separated from a cell containing the nucleic acid, or separated from at least one
contaminant nucleic acid, or separated from one or more proteins, one or more lipids) with which it
is ordinarily associated in its natural source. Isolated nucleic acid is nucleic acid present in a form or 
setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic
acids are nucleic acids such as DNA and RNA which are found in the state they exist in nature. For
example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to 
neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, 
are found in the cell as a mixture with numerous other mRNAs which encode a multitude of 
proteins. However, an isolated nucleic acid sequence comprising SEQ ID NO:1 includes, by way of
example, such nucleic acid sequences in cells which ordinarily contain SEQ ID NO:1 where the
nucleic acid sequence is in a chromosomal or extrachromosomal location different from that of 
natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. 
The isolated nucleic acid sequence may be present in single-stranded or double-stranded form. 
When an isolated nucleic acid sequence is to be utilized to express a protein, the nucleic acid
sequence will contain at a minimum at least a portion of the sense or coding strand (i.e., the nucleic
acid sequence may be single-stranded). Alternatively, it may contain both the sense and anti-sense 
strands (i.e., the nucleic acid sequence may be double-stranded). 

As used herein, the term "purified" refers to molecules, either nucleic or amino acid 
sequences, that are removed from their natural environment, isolated or separated. An "isolated
nucleic acid sequence" is therefore a purified nucleic acid sequence. "Substantially purified"
molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free,
from other components with which they are naturally associated. The present invention 
contemplates both purified (including substantially purified) and unpurified hybrid enzyme(s) (which 
are described in more detail below). 

As used herein, the terms "complementary" or "complementarity" are used in reference to
nucleotide sequences related by the base-pairing rules. For example, the sequence 5'-AGT-3' is
complementary to the sequence 5'-ACT-3'. Complementarity can be "partial" or "total." "Partial"
complementarity is where one or more nucleic acid bases is not matched according to the base
pairing rules. "Total" or "complete" complementarity between nucleic acids is where each and every
nucleic acid base is matched with another base under the base pairing rules. The degree of
complementarity between nucleic acid strands has significant effects on the efficiency and strength
of hybridization between nucleic acid strands.
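
The base-pairing rules and the "partial" versus "total" distinction can be illustrated with a short sketch. The function names below are hypothetical, and the sketch assumes equal-length DNA strands written 5' to 3' and paired antiparallel (A with T, G with C); it follows the wording above literally, so anything short of a full match is reported as "partial".

```python
# Watson-Crick pairing for DNA; an assumption of this sketch is that
# only the four unmodified bases occur.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that is totally complementary to `seq` (both 5'->3')."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def matched_positions(a, b):
    """Count positions at which strand `a` pairs with antiparallel strand `b`."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares strands of equal length")
    return sum(PAIR[x] == y for x, y in zip(a.upper(), reversed(b.upper())))


def classify(a, b):
    """'total' if every base is matched, otherwise 'partial'."""
    return "total" if matched_positions(a, b) == len(a) else "partial"


print(reverse_complement("AGT"))   # the example pair from the text
print(classify("AGT", "ACT"))
print(classify("AGT", "ACC"))
```

Reading one strand in reverse before comparing is what makes 5'-AGT-3' and 5'-ACT-3' pair at every position.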



WO 03/078637 PCT/IB03/01626 

A "complement" of a nucleic acid sequence as used herein refers to a nucleotide sequence 
whose nucleic acids show total complementarity to the nucleic acids of the nucleic acid sequence. 
For example, the present invention contemplates the complements of SEQ ID NOS: 1, 3, 5, 9, 27, 28, 
29, 30, 31, 32, 33, 34, 35, 37, 38, 40, 41 and 43. 

The term "homology" when used in relation to nucleic acids refers to a degree of
complementarity. There may be partial homology (i.e., partial identity) or complete homology (i.e.,
complete identity). A partially complementary sequence is one that at least partially inhibits a
completely complementary sequence from hybridizing to a target nucleic acid and is referred to
using the functional term "substantially homologous." The inhibition of hybridization of the
completely complementary sequence to the target sequence may be examined using a hybridization
assay (Southern or Northern blot, solution hybridization and the like) under conditions of low 
stringency. A substantially homologous sequence or probe (i.e., an oligonucleotide which is capable 
of hybridizing to another oligonucleotide of interest) will compete for and inhibit the binding (i.e., 
the hybridization) of a completely homologous sequence to a target under conditions of low 
stringency. This is not to say that conditions of low stringency are such that non-specific binding is
permitted; low stringency conditions require that the binding of two sequences to one another be a
specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of
a second target which lacks even a partial degree of complementarity (e.g., less than about 30%
identity); in the absence of non-specific binding the probe will not hybridize to the second
non-complementary target.

When used in reference to a double-stranded nucleic acid sequence such as a cDNA or 
genomic clone, the term "substantially homologous" refers to any probe which can hybridize to 
either or both strands of the double-stranded nucleic acid sequence under conditions of low 
stringency as described infra. 

When used in reference to a single-stranded nucleic acid sequence, the term "substantially
homologous" refers to any probe which can hybridize to the single-stranded nucleic acid sequence
under conditions of low stringency as described infra. 

The term "hybridization" as used herein includes "any process by which a strand of nucleic 
acid joins with a complementary strand through base pairing." [Coombs J (1994) Dictionary of 
Biotechnology, Stockton Press, New York NY]. Hybridization and the strength of hybridization (i.e.,
the strength of the association between the nucleic acids) are affected by such factors as the degree of
complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the
formed hybrid, and the G:C ratio within the nucleic acids.

As used herein, the term "Tm" is used in reference to the "melting temperature." The melting
temperature is the temperature at which a population of double-stranded nucleic acid molecules
becomes half dissociated into single strands. The equation for calculating the Tm of nucleic acids is
well known in the art. As indicated by standard references, a simple estimate of the Tm value may be
calculated by the equation: Tm = 81.5 + 0.41(% G + C), when a nucleic acid is in aqueous solution
at 1 M NaCl [see e.g., Anderson and Young, Quantitative Filter Hybridization, in: Nucleic Acid
Hybridization (1985)]. Other references include more sophisticated computations which take
structural as well as sequence characteristics into account for the calculation of Tm.
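
As a worked illustration of the simple estimate Tm = 81.5 + 0.41(% G + C), the sketch below computes the G+C percentage of a sequence and applies the equation. The function names are hypothetical; the result is taken to be in degrees Celsius and to apply only under the stated condition (aqueous solution at 1 M NaCl), and the estimate ignores probe length and the structural effects mentioned above.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def tm_estimate(seq):
    """Simple Tm estimate: 81.5 + 0.41 * (%G+C), assumed to be in deg C at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


# A 50% G+C sequence gives 81.5 + 0.41 * 50 = 102.0 (up to rounding).
print(round(tm_estimate("ATGCATGCAT" + "GCATGCATGC"), 1))
```

Because the equation depends only on base composition, two sequences with the same G+C percentage receive the same estimate regardless of length or base order.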

Low stringency conditions when used in reference to nucleic acid hybridization comprise 
conditions equivalent to binding or hybridization at 68°C in a solution consisting of 5X SSPE 
(Saline, Sodium Phosphate, EDTA) (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA
(Ethylenediaminetetraacetic Acid), pH adjusted to 7.4 with NaOH), 0.1% SDS (Sodium dodecyl
sulfate), 5X Denhardt's reagent [50X Denhardt's contains the following per 500 ml: 5 g Ficoll (Type
400, Pharmacia), 5 g BSA (Bovine Serum Albumin) (Fraction V; Sigma)] and 100 µg/ml denatured
salmon sperm DNA followed by washing in a solution comprising between 0.2X and 2.0X SSPE, 
and 0.1% SDS at room temperature when a DNA probe of about 100 to about 1000 nucleotides in 
length is employed. 

High stringency conditions when used in reference to nucleic acid hybridization comprise 
conditions equivalent to binding or hybridization at 68°C in a solution consisting of 5X SSPE, 1% 
SDS, 5X Denhardt's reagent and 100 µg/ml denatured salmon sperm DNA followed by washing in a
solution comprising 0.1X SSPE, and 0.1% SDS at 68°C when a probe of about 100 to about 1000 
nucleotides in length is employed. 
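
The hybridization and wash solutions above are stated as multiples ("X") of SSPE. The sketch below, offered only as an arithmetic aid, scales the 5X SSPE recipe given in the text to another strength and applies the usual C1V1 = C2V2 dilution rule. The function names are hypothetical, linear scaling of all three components is assumed, and pH adjustment is not modelled.

```python
# Grams per litre in 5X SSPE, as listed in the low stringency definition.
SSPE_5X_G_PER_L = {"NaCl": 43.8, "NaH2PO4.H2O": 6.9, "EDTA": 1.85}


def sspe_grams_per_litre(strength_x):
    """Component concentrations (g/l) for SSPE at the given X strength."""
    if strength_x < 0:
        raise ValueError("strength must be non-negative")
    return {name: grams * strength_x / 5.0 for name, grams in SSPE_5X_G_PER_L.items()}


def stock_volume_ml(stock_x, target_x, final_ml):
    """Volume of stock needed for `final_ml` at `target_x` (C1*V1 = C2*V2)."""
    if stock_x <= 0 or target_x > stock_x:
        raise ValueError("target strength must not exceed a positive stock strength")
    return final_ml * target_x / stock_x


# Preparing 1 litre of a 0.1X high stringency wash from 5X stock:
print(stock_volume_ml(5, 0.1, 1000), "ml of 5X SSPE, made up to 1000 ml")
print(sspe_grams_per_litre(0.1))
```

The same two helpers cover the low stringency wash range (0.2X to 2.0X SSPE) by changing the target strength.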

The term "equivalent" when made in reference to a hybridization condition as it relates to a 
hybridization condition of interest means that the hybridization condition and the hybridization 
condition of interest result in hybridization of nucleic acid sequences which have the same range of 
percent (%) homology. For example, if a hybridization condition of interest results in hybridization 
of a first nucleic acid sequence with other nucleic acid sequences that have from 50% to 70% 
homology to the first nucleic acid sequence, then another hybridization condition is said to be 
equivalent to the hybridization condition of interest if this other hybridization condition also results 
in hybridization of the first nucleic acid sequence with the other nucleic acid sequences that have 
from 50% to 70% homology to the first nucleic acid sequence. 

When used in reference to nucleic acid hybridization the art knows well that numerous 
equivalent conditions may be employed to comprise either low or high stringency conditions; factors 
such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target 
(DNA, RNA, base composition, present in solution or immobilized) and the concentration of the 
salts and other components (e.g., the presence or absence of formamide, dextran sulfate, 
polyethylene glycol) are considered and the hybridization solution may be varied to generate 
conditions of either low or high stringency hybridization different from, but equivalent to, the above- 
listed conditions. 

The term "promoter," "promoter element," or "promoter sequence" as used herein, refers to a 
DNA sequence which when ligated to a nucleotide sequence of interest is capable of controlling the 
transcription of the nucleotide sequence of interest into mRNA. A promoter is typically, though not 
necessarily, located 5' (i.e., upstream) of a nucleotide sequence of interest whose transcription into 
mRNA it controls, and provides a site for specific binding by RNA polymerase and other
transcription factors for initiation of transcription. 

Promoters may be tissue specific or cell specific. The term "tissue specific" as it applies to a 
promoter refers to a promoter that is capable of directing selective expression of a nucleotide 
sequence of interest to a specific type of tissue (e.g., petals) in the relative absence of expression of
the same nucleotide sequence of interest in a different type of tissue (e.g., roots). Tissue specificity 
of a promoter may be evaluated by, for example, operably linking a reporter gene to the promoter 
sequence to generate a reporter construct, introducing the reporter construct into the genome of a 
plant such that the reporter construct is integrated into every tissue of the resulting transgenic plant, 
and detecting the expression of the reporter gene (e.g., detecting mRNA, protein, or the activity of a
protein encoded by the reporter gene) in different tissues of the transgenic plant. The detection of a 
greater level of expression of the reporter gene in one or more tissues relative to the level of 
expression of the reporter gene in other tissues shows that the promoter is specific for the tissues in 
which greater levels of expression are detected. The term "cell type specific" as applied to a 
promoter refers to a promoter which is capable of directing selective expression of a nucleotide
sequence of interest in a specific type of cell in the relative absence of expression of the same 
nucleotide sequence of interest in a different type of cell within the same tissue. The term "cell type 
specific" when applied to a promoter also means a promoter capable of promoting selective 
expression of a nucleotide sequence of interest in a region within a single tissue. Cell type 
specificity of a promoter may be assessed using methods well known in the art, e.g.,
immunohistochemical staining. Briefly, tissue sections are embedded in paraffin, and paraffin sections are
reacted with a primary antibody which is specific for the polypeptide product encoded by the 
nucleotide sequence of interest whose expression is controlled by the promoter. A labeled (e.g., 
peroxidase conjugated) secondary antibody which is specific for the primary antibody is allowed to 
bind to the sectioned tissue and specific binding detected (e.g., with avidin/biotin) by microscopy.

Promoters may be constitutive or regulatable. The term "constitutive" when made in
reference to a promoter means that the promoter is capable of directing transcription of an operably 
linked nucleic acid sequence in the absence of a stimulus (e.g., heat shock, chemicals, light, or 
similar stimuli). Typically, constitutive promoters are capable of directing expression of a transgene 
in substantially any cell and any tissue. In contrast, a "regulatable" promoter is one which is capable
of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a
stimulus (e.g., heat shock, chemicals, light, or similar stimuli) which is different from the level of 
transcription of the operably linked nucleic acid sequence in the absence of the stimulus. 

The terms "infecting" and "infection" with a bacterium refer to co-incubation of a target 
biological sample (e.g., cell, tissue, plant part) with the bacterium under conditions such that nucleic
acid sequences contained within the bacterium are introduced into one or more cells of the target 
biological sample. 

The term "Agrobacterium" refers to a soil-borne, Gram-negative, rod-shaped
phytopathogenic bacterium which causes crown gall. The term "Agrobacterium" includes, but is not 
limited to, the strains Agrobacterium tumefaciens, (which typically causes crown gall in infected 
plants), and Agrobacterium rhizogenes (which causes hairy root disease in infected host plants).
Infection of a plant cell with Agrobacterium generally results in the production of opines (e.g.,
nopaline, agropine, octopine) by the infected cell. Thus, Agrobacterium strains which cause 
production of nopaline (e.g., strain LBA4301, C58, A208) are referred to as "nopaline-type" 
Agrobacteria; Agrobacterium strains which cause production of octopine (e.g., strain LBA4404, 
Ach5, B6) are referred to as "octopine-type" Agrobacteria; and Agrobacterium strains which cause
production of agropine (e.g., strain EHA105, EHA101, A281) are referred to as "agropine-type"
Agrobacteria. 

The terms "bombarding," "bombardment," and "biolistic bombardment" refer to the process
of accelerating particles towards a target biological sample (e.g., cell, tissue, plant part - such as a
leaf, or intact plant) to effect wounding of the cell membrane of a cell in the target biological sample
and/or entry of the particles into the target biological sample. Methods for biolistic bombardment
are known in the art (e.g., U.S. Patent Nos. 5,584,807 and 5,141,131, the contents of both are herein
incorporated by reference), and are commercially available (e.g., the helium gas-driven
microprojectile accelerator (PDS-1000/He) (BioRad)).

The term "microwounding" when made in reference to plant tissue refers to the introduction
of microscopic wounds in that tissue. Microwounding may be achieved by, for example, particle
bombardment as described herein. The present invention specifically contemplates schemes for
introducing nucleic acid which employ microwounding.

The term "organism" as used herein refers to all organisms and in particular organisms
containing glycoproteins with N-linked glycans.

The term "plant" as used herein refers to a plurality of plant cells which are largely
differentiated into a structure that is present at any stage of a plant's development. Such structures
include, but are not limited to, a fruit, shoot, stem, root, leaf, seed, flower petal, or similar structure.
The term "plant tissue" includes differentiated and undifferentiated tissues of plants including, but
not limited to, roots, shoots, leaves, pollen, seeds, tumor tissue and various types of cells in culture
(e.g., single cells, protoplasts, embryos, callus, protocorm-like bodies, and other types of cells).
Plant tissue may be in planta, in organ culture, tissue culture, or cell culture. Similarly, "plant cells"
may be cells in culture or may be part of a plant.

Glycosyltransferases are enzymes that catalyze the processing reactions that determine the
structures of cellular oligosaccharides, including the oligosaccharides on glycoproteins. As used
herein, "glycosyltransferase" is meant to include mannosidases, even though these enzymes trim
glycans and do not "transfer" a monosaccharide. Glycosyltransferases share the feature of a type II
membrane orientation. Each glycosyltransferase is comprised of an amino terminal cytoplasmic tail
(shown for illustration purposes below as made up of a string of amino acids arbitrarily labeled "X"
- without intending to suggest the actual size of the region), a signal anchor domain (shown below as
made up of a string of amino acids labeled "H" for hydrophobic - without intending to suggest the
actual size of the domain and without intending to suggest that the domain is only made up of
hydrophobic amino acids) that spans the membrane (referred to herein as a "transmembrane
domain"), followed by a luminal stem or stalk region (shown below as made up of a string of amino
acids arbitrarily labeled "S" - without intending to suggest the actual size of the region), and a
carboxy-terminal catalytic domain (shown below as made up of a string of amino acids arbitrarily
labeled "C" - without intending to suggest the actual size of the domain):

NH.- XXXXXXHHHHHHHHSSSSSSSS CCCCCCCC 

Collectively, the Cytoplasmic Tail-Transmembrane-Stem Region or "CTS" (which has been
underlined in the above schematic for clarity) can be used (or portions thereof) in embodiments 
contemplated by the present invention wherein the catalytic domain is exchanged or "swapped" with 
a corresponding catalytic domain from another molecule (or portions of such regions/domains) to 
create a hybrid protein. 

For example, in a preferred embodiment, the present invention contemplates nucleic acid
encoding a hybrid enzyme (as well as vectors containing such nucleic acid, host cells containing
such vectors, and the hybrid enzyme itself), said hybrid enzyme comprising at least a portion of a
CTS region [e.g., the cytoplasmic tail ("C"), the transmembrane domain ("T"), the cytoplasmic tail
together with the transmembrane domain ("CT"), the transmembrane domain together with the stem
("TS"), or the complete CTS region] of a first glycosyltransferase (e.g. plant glycosyltransferase) and
at least a portion of a catalytic region of a second glycosyltransferase (e.g. mammalian
glycosyltransferase). To create such an embodiment, the coding sequence for the entire CTS region
(or portion thereof) may be deleted from nucleic acid coding for the mammalian glycosyltransferase
and replaced with the coding sequence for the entire CTS region (or portion thereof) of a plant
glycosyltransferase. On the other hand, a different approach might be taken to create this
embodiment; for example, the coding sequence for the entire catalytic domain (or portion thereof)
may be deleted from the coding sequence for the plant glycosyltransferase and replaced with the
coding sequence for the entire catalytic domain (or portion thereof) of the mammalian
glycosyltransferase. In such a case, the resulting hybrid enzyme would have the amino-terminal
cytoplasmic tail of the plant glycosyltransferase linked to the plant glycosyltransferase
transmembrane domain linked to the stem region of the plant glycosyltransferase in the normal
manner of the wild-type plant enzyme - but the stem region would be linked to the catalytic domain
of the mammalian glycosyltransferase (or portion thereof).
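
The domain exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Glycosyltransferase`, the function `swap_catalytic`, and the placeholder strings are hypothetical and are not sequences from this disclosure. The strings reuse the arbitrary "X", "H", "S" and "C" labels of the schematic (lower case for the mammalian enzyme) and do not suggest actual domain sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glycosyltransferase:
    """Type II membrane protein modeled as four consecutive regions (N- to C-terminus)."""
    origin: str
    tail: str            # amino-terminal cytoplasmic tail ("C" of CTS)
    transmembrane: str   # signal anchor / transmembrane domain ("T" of CTS)
    stem: str            # luminal stem or stalk region ("S" of CTS)
    catalytic: str       # carboxy-terminal catalytic domain

    @property
    def cts(self) -> str:
        """Cytoplasmic tail + transmembrane domain + stem region."""
        return self.tail + self.transmembrane + self.stem

    @property
    def sequence(self) -> str:
        return self.cts + self.catalytic


def swap_catalytic(cts_donor: Glycosyltransferase,
                   catalytic_donor: Glycosyltransferase) -> Glycosyltransferase:
    """Keep the complete CTS region of one enzyme and attach the catalytic domain of another."""
    return Glycosyltransferase(
        origin=f"{cts_donor.origin} CTS / {catalytic_donor.origin} catalytic",
        tail=cts_donor.tail,
        transmembrane=cts_donor.transmembrane,
        stem=cts_donor.stem,
        catalytic=catalytic_donor.catalytic,
    )


# Placeholder regions only; upper case = plant, lower case = mammalian.
plant = Glycosyltransferase("plant", "XXXXXX", "HHHHHHHH", "SSSSSSSS", "CCCCCCCC")
mammal = Glycosyltransferase("mammalian", "xxxxxx", "hhhhhhhh", "ssssssss", "cccccccc")

hybrid = swap_catalytic(plant, mammal)
print(hybrid.sequence)  # XXXXXXHHHHHHHHSSSSSSSScccccccc
```

Both approaches described above lead to the same arrangement in this model: the plant CTS region followed by the mammalian catalytic domain. They differ only in which coding sequence is taken as the starting point.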

It is not intended that the present invention be limited only to the two approaches outlined
above. Other variations in the approach are contemplated. For example, to create nucleic acid
encoding a hybrid enzyme, said hybrid enzyme comprising at least a portion of a transmembrane
region of a plant glycosyltransferase and at least a portion of a catalytic region of a mammalian
glycosyltransferase, one might use less than the entire coding sequence for the CTS region (e.g., only
the transmembrane domain of the plant glycosyltransferase, or the complete cytoplasmic tail together
with all or a portion of the transmembrane domain, or the complete cytoplasmic tail together with all
of the transmembrane domain together with a portion of the stem region). One might delete the
mammalian coding sequence for the entire cytoplasmic tail together with the coding sequence for the
transmembrane domain (or portion thereof) - followed by replacement with the corresponding
coding sequence for the cytoplasmic tail and transmembrane domain (or portion thereof) of the plant
glycosyltransferase. In such a case, the resulting hybrid enzyme would have the stem region of the
mammalian glycosyltransferase linked to the plant glycosyltransferase transmembrane domain (or
portion thereof) which in turn would be linked to the amino-terminal cytoplasmic tail of the plant
glycosyltransferase, with the stem region being linked to the catalytic domain of the mammalian
glycosyltransferase (i.e. two of the four regions/domains would be of plant origin and two would be
of mammalian origin).

In other embodiments, the present invention contemplates nucleic acid encoding a hybrid
enzyme (along with vectors, host cells containing the vectors, plants - or plant parts - containing the
host cells), said hybrid enzyme comprising at least a portion of an amino-terminal cytoplasmic tail of
a plant glycosyltransferase and at least a portion of a catalytic region of a mammalian
glycosyltransferase. In this embodiment, the hybrid enzyme encoded by the nucleic acid might or
might not contain other plant sequences (e.g., the transmembrane domain or portion thereof, the stem
region or portion thereof). For example, to create such an embodiment, the coding sequence for the
entire cytoplasmic tail (or portion thereof) may be deleted from nucleic acid coding for the
mammalian glycosyltransferase and replaced with the coding sequence for the entire cytoplasmic
domain (or portion thereof) of a plant glycosyltransferase. In such a case, the resulting hybrid
enzyme would have the amino-terminal cytoplasmic tail (or portion thereof) of the plant
glycosyltransferase linked to the mammalian glycosyltransferase transmembrane domain, which in
turn is linked to the stem region of the mammalian glycosyltransferase, the stem region being linked to
the catalytic domain of the mammalian glycosyltransferase. On the other hand, a different approach
might be taken to create this embodiment; for example, the coding sequence for the entire catalytic
domain (or portion thereof) may be deleted from the coding sequence for the plant
glycosyltransferase and replaced with the coding sequence for the entire catalytic domain (or portion
thereof) of the mammalian glycosyltransferase. In such a case, the resulting hybrid enzyme would
have the amino-terminal cytoplasmic tail of the plant glycosyltransferase linked to the plant
glycosyltransferase transmembrane domain linked to the stem region of the plant glycosyltransferase
in the normal manner of the wild-type plant enzyme - but the stem region would be linked to the
catalytic domain of the mammalian glycosyltransferase (or portion thereof).

In the above discussion, the phrase "or portion thereof" was used to expressly
indicate that less than the entire region/domain might be employed in the particular case (e.g., a
fragment might be used). For example, the cytoplasmic tail of glycosyltransferases ranges from
approximately 5 to 50 amino acids in length, and more typically 15 to 30 amino acids, depending on
the particular transferase. A "portion" of the cytoplasmic tail region is herein defined as no fewer
than four amino acids and can be as large as up to the full length of the region/domain less one
amino acid. It is desired that the portion function in a manner analogous to the full length
region/domain - but need not function to the same degree. For example, to the extent the full-length
cytoplasmic tail functions as a Golgi retention region or ER retention signal, it is desired that the
portion employed in the above-named embodiments also function as a Golgi or ER retention region,
albeit perhaps not as efficiently as the full-length region.

Similarly, the transmembrane domain is typically 15-25 amino acids in length and made up
of primarily hydrophobic amino acids. A "portion" of the transmembrane domain is herein defined
as no fewer than ten amino acids and can be as large as up to the full length of the region/domain
(for the particular type of transferase) less one amino acid. It is desired that the portion function in a
manner analogous to the full length region/domain - but need not function to the same degree. For
example, to the extent the full-length transmembrane domain functions as the primary Golgi
retention region or ER retention signal, it is desired that the portion employed in the above-named
embodiments also function as a Golgi or ER retention region, albeit perhaps not as efficiently as the
full-length region. The present invention specifically contemplates conservative substitutions to
create variants of the wild-type transmembrane domain or portions thereof. For example, the present
invention contemplates replacing one or more hydrophobic amino acids (shown as "H" in the
schematic above) of the wild-type sequence with one or more different amino acids, preferably also
hydrophobic amino acids.
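
The length limits that define a "portion" of the cytoplasmic tail and of the transmembrane domain can be expressed as a simple check. The following is a minimal sketch: the names `MIN_PORTION` and `is_valid_portion` are hypothetical, and the check covers length only. Whether a fragment still functions as a retention region is a functional question that this sketch does not address.

```python
# Minimum portion sizes (in amino acids) taken from the definitions above.
MIN_PORTION = {
    "cytoplasmic_tail": 4,   # "no fewer than four amino acids"
    "transmembrane": 10,     # "no fewer than ten amino acids"
}


def is_valid_portion(domain: str, full_length: int, portion_length: int) -> bool:
    """Return True if portion_length lies between the stated minimum and
    the full length of the region/domain less one amino acid."""
    minimum = MIN_PORTION[domain]
    return minimum <= portion_length <= full_length - 1


# Example: a 20-residue cytoplasmic tail and an 18-residue transmembrane domain.
print(is_valid_portion("cytoplasmic_tail", 20, 4))    # True
print(is_valid_portion("cytoplasmic_tail", 20, 20))   # False: the full length is not a "portion"
print(is_valid_portion("transmembrane", 18, 9))       # False: below the ten-residue minimum
```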

A portion of the catalytic domain can be as large as the full length of the domain less one
amino acid. Where the catalytic domain is from a beta 1,4-galactosyltransferase, it is preferred that
the portion include at a minimum residues 345-365 which are believed to be involved in the
conformation conferring an oligosaccharide acceptor binding site (it is preferred that the portion
include this region at a minimum and five to ten amino acids on either side to permit the proper
conformation).

The present invention also includes synthetic CTS regions and portions thereof. A "portion"
of a CTS region must include at least one (and may include more than one) entire domain (e.g., the
entire transmembrane domain) but less than the entire CTS region.

Importantly, by using the term "CTS region" or "transmembrane domain" it is not intended
that only wild type sequences be encompassed. Indeed, this invention is not limited to natural
glycosyltransferases and enzymes involved in glycosylation, but also includes the use of synthetic
enzymes that exhibit the same or similar function. In one embodiment, wild type domains are changed
(e.g. by deletion, insertion, replacement and the like).

Finally, by using the indicator "Tm" when referring to a particular hybrid (e.g., "TmXyl-"),
entire transmembrane/CTS domains (with or without changes to the wild-type sequence) as well as
portions (with or without changes to the wild-type sequence) are intended to be encompassed.

SUMMARY OF THE INVENTION

The present invention contemplates nucleic acid (whether DNA or RNA) encoding hybrid
enzymes (or "fusion proteins"), vectors containing such nucleic acid, host cells (including but not
limited to cells in plant tissue and whole plants) containing such vectors and expressing the hybrid
enzymes, and the isolated hybrid enzyme(s) themselves. In one embodiment, expression of said
hybrid enzymes (or "fusion proteins") results in changes in glycosylation, such as, but not limited to,
reduction of sugar moieties such as xylose, fucose, Lewis A, or other sugar structures that interfere
with desired glycoform accumulation. In one embodiment, the present invention contemplates
nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising a CTS region (or portion
thereof) of a glycosyltransferase (including but not limited to a plant glycosyltransferase) and a
catalytic region (or portion thereof) of a non-plant glycosyltransferase (e.g., mammalian, fish,
amphibian, fungal). It is preferred that, when expressed, the CTS region (or portion thereof) is
linked (directly or indirectly) in operable combination to said catalytic region (or portion thereof).
The linking is preferably covalent and the combination is operable in that the catalytic region
exhibits catalytic function (even if said catalytic function is reduced as compared to the wild-type
enzyme). The linking can be direct in the sense that there are no intervening amino acids or other
regions/domains. On the other hand, the linking can be indirect in that there are intervening amino
acids (or other chemical groups) and/or other regions/domains between them. Of course, the nucleic
acid used to make the nucleic acid encoding the above-described hybrid enzyme(s) can be obtained
enzymatically from a physical sequence (e.g. genomic DNA, a cDNA, and the like) or alternatively,
made synthetically using a reference sequence (e.g. electronic or hardcopy sequence) as a guide.

In a particular embodiment, the present invention contemplates nucleic acid encoding a
hybrid enzyme, said hybrid enzyme comprising a transmembrane region (e.g., at least a
transmembrane region and optionally more of the CTS region) of a plant glycosyltransferase and a
catalytic region (or portion thereof) of a non-plant (such as a mammalian) glycosyltransferase.
Again, it is preferred that, when expressed, these regions are linked (directly or indirectly) in 
operable combination. In yet another embodiment, the present invention contemplates nucleic acid 
encoding a hybrid enzyme, said hybrid enzyme comprising a transmembrane domain (or portion 
thereof) of a plant glycosyltransferase and a catalytic region (or portion thereof) of a mammalian 
glycosyltransferase. Again, it is preferred that, when expressed, these regions are linked (directly or 
indirectly) in operable combination. 

It is not intended that the present invention be limited to particular transferases. In one
embodiment, the plant glycosyltransferase is a xylosyltransferase. In another embodiment, the plant
glycosyltransferase is an N-acetylglucosaminyltransferase. In another embodiment, the plant
glycosyltransferase is a fucosyltransferase. In a preferred embodiment, the mammalian
glycosyltransferase is a human galactosyltransferase (such as the human beta 1,4-galactosyltransferase
encoded by SEQ ID NO:1 wherein the nucleotides encoding the
transmembrane domain are deleted and replaced).

It is not intended that the present invention be limited to the use of a plant-derived
glycosyltransferase CTS-domain and a human glycosyltransferase catalytic domain; the reverse
arrangement is also contemplated, as is the use of any CTS-domain of a glycosyltransferase in
combination with the catalytic fragment of at least one other glycosyltransferase. Indeed, the present
invention broadly contemplates, in one embodiment, nucleic acid encoding a hybrid enzyme, said hybrid enzyme
comprising a transmembrane region of a first glycosyltransferase and a catalytic region of a second
glycosyltransferase. It is preferred that said first and second glycosyltransferases are from different
species (and can be from a different genus or even from a different phylum). In one embodiment,
said first glycosyltransferase comprises a plant glycosyltransferase. In another embodiment, said
plant glycosyltransferase is a xylosyltransferase. In yet another embodiment, said plant
glycosyltransferase is a fucosyltransferase. In a preferred embodiment said second
glycosyltransferase comprises a mammalian glycosyltransferase. In a particularly preferred
embodiment, said mammalian glycosyltransferase is a human galactosyltransferase.

It is not intended that the present invention be limited to circumstances where the first and
second glycosyltransferases are plant and non-plant, respectively. In one embodiment, said first
glycosyltransferase comprises a first mammalian glycosyltransferase and said second
glycosyltransferase comprises a second mammalian glycosyltransferase. In a preferred embodiment,
said first mammalian glycosyltransferase is a non-human glycosyltransferase and said second
mammalian glycosyltransferase is a human glycosyltransferase.

It is not intended that the present invention be limited to the type of vector. In one 
embodiment, the present invention contemplates an expression vector, comprising the nucleic acid 
encoding the above-described hybrid enzyme. 

It is also not intended that the present invention be limited to the type of host cells. A variety
of prokaryotic and eukaryotic host cells are commercially available for expressing proteins. In one
embodiment, the present invention contemplates a host cell containing the vector comprising the
nucleic acid encoding the above-described hybrid enzyme (with or without other vectors or other
nucleic acid encoding other hybrid enzymes or glycosyltransferases). In a preferred embodiment, the
host cell is a plant cell. In a particularly preferred embodiment, the present invention contemplates a
plant comprising such a host cell.

It is not intended that the present invention be limited by the method by which host cells are
made to express the hybrid enzymes of the present invention. In one embodiment, the present
invention contemplates a method, comprising: a) providing: i) a host cell (such as a plant cell,
whether in culture or as part of plant tissue or even as part of an intact growing plant), and ii) an
expression vector comprising nucleic acid encoding a hybrid enzyme, said hybrid enzyme
comprising at least a portion of a CTS region of a plant glycosyltransferase (e.g. the transmembrane
domain) and at least a portion of a catalytic region of a mammalian glycosyltransferase; and b)
introducing said expression vector into said plant cell under conditions such that said hybrid enzyme
is expressed. Again, it is not intended that the present invention be limited to particular transferases.
In one embodiment, the plant glycosyltransferase used in the above-described method is a
xylosyltransferase. In another embodiment, the plant glycosyltransferase is an
N-acetylglucosaminyltransferase. In another embodiment, the plant glycosyltransferase is a
fucosyltransferase. In a preferred embodiment, the mammalian glycosyltransferase used in the
above-described method is a human galactosyltransferase (such as the human beta 1,4-galactosyltransferase
encoded by SEQ ID NO:1 wherein the nucleotides encoding the
transmembrane domain are deleted and replaced, or simply where the nucleotides of SEQ ID NO:1
encoding the catalytic domain, or portion thereof, are taken and linked to nucleotides encoding the
CTS region, or portion thereof, of a plant glycosyltransferase).

It is not intended that the present invention be limited to a particular scheme for controlling
glycosylation of a heterologous protein using the hybrid enzymes described above. In one
embodiment, the present invention contemplates a method, comprising: a) providing: i) a host cell
(such as a plant cell), ii) a first expression vector comprising nucleic acid encoding a hybrid enzyme,
said hybrid enzyme comprising at least a portion of a CTS region (e.g. at least a transmembrane
domain) of a first (such as a plant) glycosyltransferase and at least a portion of a catalytic region of a
second (such as a mammalian) glycosyltransferase, and iii) a second expression vector comprising
nucleic acid encoding a heterologous glycoprotein (or portion thereof); and b) introducing said first
and second expression vectors into said plant cell under conditions such that said hybrid enzyme and
said heterologous protein are expressed. Alternatively, a single vector with nucleic acid encoding
both the hybrid enzyme (or hybrid enzymes) and the heterologous glycoprotein might be used.
Regardless of which method is used, the invention contemplates, in one embodiment, the additional
step (c) of isolating the heterologous protein - as well as the isolated protein itself as a composition.

On the other hand, the present invention also contemplates introducing different vectors into
different plant cells (whether they are cells in culture, part of plant tissue, or even part of an intact
growing plant). In one embodiment, the present invention contemplates a method, comprising: a)
providing: i) a first plant comprising a first expression vector, said first vector comprising nucleic
acid encoding a hybrid enzyme (or encoding two or more hybrid enzymes), said hybrid enzyme
comprising at least a portion of a CTS region (e.g. the first approximately 40-60 amino acids of the
N-terminus) of a plant glycosyltransferase and at least a portion of a catalytic region of a mammalian
glycosyltransferase, and ii) a second plant comprising a second expression vector, said second vector
comprising nucleic acid encoding a heterologous protein (or portion thereof); and b) crossing said first
plant and said second plant to produce progeny expressing said hybrid enzyme and said heterologous
protein. Of course, such progeny can be isolated, grown up, and analyzed for the presence of each
(or both) of the proteins. Indeed, the heterologous protein can be used (typically first purified
substantially free of plant cellular material) therapeutically (e.g., administered to a human or animal,
whether orally, intravenously, transdermally, or by some other route of administration) to treat or
prevent disease.

It is not intended that the present invention be limited to a particular heterologous protein. In
one embodiment, any peptide or protein that is not endogenous to the host cell (or organism) is
contemplated. In one embodiment, the heterologous protein is an antibody or antibody fragment. In
a particularly preferred embodiment, the antibody is a human antibody or "humanized" antibody
expressed in a plant in high yield. "Humanized" antibodies are typically prepared from non-human
antibodies (e.g. rodent antibodies) by taking the hypervariable regions (the so-called CDRs) of the
non-human antibodies and "grafting" them on to human frameworks. The entire process can be
synthetic (provided that the sequences are known) and frameworks can be selected from a database
of common human frameworks. Many times, there is a loss of affinity in the process unless either
the framework sequences are modified or the CDRs are modified. Indeed, increases in affinity can
be revealed when the CDRs are systematically mutated (for example, by randomization procedures)
and tested.

While the present invention is particularly useful in the context of heterologous proteins, in
one embodiment, the hybrid enzymes of the present invention are used to change the glycosylation of
endogenous proteins, i.e. proteins normally expressed by the host cell or organism.

The present invention specifically contemplates the plants themselves. In one embodiment,
the present invention contemplates a plant, comprising first and second expression vectors, said first
vector comprising nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising at least a
portion of a CTS region (e.g. the cytoplasmic tail together with at least a portion of the
transmembrane domain) of a plant glycosyltransferase and at least a portion of a catalytic region of a
mammalian glycosyltransferase, said second vector comprising
nucleic acid encoding a heterologous protein (or portion thereof). In a preferred embodiment, by
virtue of being expressed along with the hybrid enzyme (or hybrid enzymes) of the present invention,
the heterologous protein displays reduced (10% to 99%) alpha 1,3-fucosylation (or even no
fucosylation), as compared to when the heterologous protein is expressed in the plant in the absence
of the hybrid enzyme (or enzymes). In a preferred embodiment, by virtue of being expressed along
with the hybrid enzyme (or hybrid enzymes) of the present invention, the heterologous protein
displays reduced (10% to 99%) xylosylation (or even no xylose), as compared to when the
heterologous protein is expressed in the plant in the absence of the hybrid enzyme (or enzymes). In a
preferred embodiment, by virtue of being expressed along with the hybrid enzyme (or hybrid
enzymes) of the present invention, the heterologous protein displays both reduced fucose and xylose,
as compared to when the heterologous protein is expressed in the plant in the absence of the hybrid
enzyme (or enzymes).

It is not intended that the present invention be limited to a particular theory by which reduced
fucose and/or xylose is achieved. Very little is known about the sub-Golgi sorting mechanism in
plants. The mammalian specific β(1,4)-galactosyltransferase (GalT) has been used (see the
Examples below) as an excellent first marker to study this phenomenon since it generates glycan
structures not normally found in plants. The glycan structures of plants that express
galactosyltransferase have been compared with glycan structures from plants that express a chimeric
galactosyltransferase of which the CTS domain is exchanged for that of a plant xylosyltransferase (or
portion thereof). The change in observed glycan structures shows that the galactosyltransferase is, as
in mammals, confined to a specific sub-compartment of the plant Golgi. Without limiting the
invention to any particular mechanism, the sorting mechanisms of plants and mammals are apparently
conserved even to the extent that glycosyltransferases unknown to plants are routed to a specific
analogous location in the Golgi. This location is later in the Golgi than where the endogenous
xylosyl-, fucosyl- and GlcNAcTII (GnTII) transferases are located.

The finding that N-glycans in these plants that express relocalised variants of GalT
contain significantly less xylose and fucose is also of biotechnological relevance. For
glycoproteins intended for therapeutic use in mammals, such as humans, the approach of certain
embodiments of the present invention provides methods and compositions for controlling N-linked
glycosylation of glycoproteins in plants so that glycoprotein essentially free of xylose and fucose and
containing at least a bi-antennary N-glycan (but not limited to bi-antennary, also including
tri-antennary, and the like) and (at least one) galactose residue on at least one of the arms of the
N-glycan can be obtained. Hence, it is not intended that the present invention be limited to bi-antennary
N-glycans but also includes bisected bi-antennary N-glycans, tri-antennary N-glycans, and the like.
Furthermore, the invention is not limited to complex-type N-glycans but also includes hybrid-type
N-glycans and other type N-glycans. The present invention contemplates such resulting
glycoproteins. In addition, the methods and compositions of the present invention may be applicable for
plants and non-plant systems where, besides xylose and fucose, Lewis A type N-glycan modifications
(β1-3-GalT, α1-4-FucT, other) or other sugars "interfere" with desired glycoform accumulation.
In one embodiment, the invention is directed to controlling N-linked glycosylation of plants
by modulating the localization of enzymes involved in glycan biosynthesis in the Golgi apparatus.
Specifically, embodiments of the invention are directed to a method of producing in a plant host
system a glycoprotein having bi-antennary glycans and containing at least one galactose residue on
at least one of the arms and which are devoid of (or reduced in) xylose and fucose, comprising: (a)
preventing (or inhibiting) addition of xylose and fucose on the core of the glycan of said glycoprotein
and (b) adding one or preferably two galactose residues to said arms.

Addition of xylose and fucose to said heterologous glycoprotein may be reduced or even
prevented by introducing to said plant host system a nucleic acid encoding a hybrid enzyme
comprising a CTS region (or portion thereof) of a protein, particularly an enzyme such as plant
xylosyltransferase, and a catalytic region (or portion thereof) of a galactosyltransferase not normally
found in a plant, or a modified galactosyltransferase where its transmembrane portion has been
removed and an endoplasmic reticulum retention signal has been inserted, wherein said protein or
enzyme acts earlier in the Golgi apparatus of a plant cell in said plant host system than said



WO 03/078637 PCT/1B03/01626 


galactosyltransferase. It is preferred that the galactosyltransferase is a mammalian
galactosyltransferase and in particular, a human galactosyltransferase. In a most specific
embodiment, said galactosyltransferase is human β1,4-galactosyltransferase (GalT). In a preferred
embodiment, said xylosyltransferase is a β1,2-xylosyltransferase. The exchange of the CTS region or
CTS fragment of a mammalian glycosyltransferase (such as a galactosyltransferase) by one from the
group of enzymes that act earlier in the Golgi apparatus than galactosyltransferase, including but not
limited to those of XylT, FucT, GnTI, GnTII, GnTIII, GnTIV, GnTV, GnTVI, ManI, ManII and
ManIII, results in strongly reduced amounts of glycans that contain the undesired xylose and fucose
residues (see Figure 2). In addition, galactosylation is improved and the diversity in glycans is
reduced. While not limited to any particular mechanism, the increase in galactosylated glycans that
carry neither xylose nor fucose is believed to be mainly attributed to the accumulation of
GalGNMan5, GNMan5 or GalGNMan4. Also, galactosylation occurs on one glycan arm only.
Apparently, the galactosylation earlier in the Golgi inhibits trimming of the said glycoforms by
Mannosidase II (ManII) to GalGNMan3. Also, addition of the second GlcNAc by GlcNAcTII
(GnTII) is inhibited.

Therefore, in one embodiment, a further step is contemplated to obtain the desired
glycoprotein that has both arms galactosylated and yet is essentially devoid of xylose and fucose.
Thus, in one embodiment, the method of the invention as noted above further comprises adding
galactose residues to the arms of said glycoprotein (see Figure 3). In one embodiment of the
invention, galactose residues are added onto both arms by introducing to said plant host system (a) a
nucleic acid sequence encoding a first hybrid enzyme comprising the CTS region (or fragment, such
as one including the transmembrane domain) of GnTI and the active domain (or portion thereof) of
GnTII; (b) a nucleic acid sequence encoding the second hybrid enzyme comprising the CTS region
(or fragment, such as one including the transmembrane domain) of GnTI and the active domain of ManII; and
(c) a nucleic acid sequence encoding a third hybrid enzyme comprising the CTS region (or fragment,
such as one including the transmembrane domain) of XylT and the active domain (or portion
thereof) of human galactosyltransferase (TmXyl-GalT). In another embodiment of the invention,
galactose residues are added onto both arms by introducing to said plant host system (a) a nucleic
acid sequence encoding a first hybrid enzyme comprising the CTS region (or fragment, such as one
including the transmembrane domain) of ManI and the active domain (or portion thereof) of GnTI;
(b) a nucleic acid sequence encoding the second hybrid enzyme comprising the CTS region (or
fragment, such as one including the transmembrane domain) of ManI and the active domain (or
portion thereof) of GnTII; (c) a nucleic acid sequence encoding the third hybrid enzyme comprising
the CTS region (or fragment, such as one including the transmembrane domain) of ManI and the
active domain (or portion thereof) of ManII, and (d) a nucleic acid sequence encoding a fourth
hybrid enzyme comprising the CTS region (or fragment, such as one including the transmembrane
domain) of XylT and the active domain (or portion thereof) of human galactosyltransferase
(TmXyl-GalT).

It is not intended that the present invention be limited to particular combinations of hybrid
enzymes or the number of such hybrid enzymes employed in a single cell, plant tissue or plant. In a
preferred embodiment, the present invention contemplates host cells expressing TmXyl-GalT plus
TmGnTI-GnTII plus TmGnTI-ManII. In one embodiment of the invention, galactose residues are
added to said arms by introducing to said plant host system (a) a nucleic acid sequence encoding a
first hybrid enzyme comprising a CTS region (or fragment thereof) of a protein, particularly an
enzyme, including but not limited to N-acetylglucosaminyltransferase I (GnTI) and a catalytic region
(or portion thereof) of a mannosidase II (ManII), wherein said enzyme acts earlier in the Golgi
apparatus of a plant cell in said plant host system than said mannosidase II or modified mannosidase
II where its transmembrane portion has been deleted and an endoplasmic reticulum retention signal
has been inserted and (b) a nucleic acid sequence encoding a second hybrid enzyme comprising a
CTS region (or fragment, such as one including the transmembrane domain) of an enzyme including
but not limited to N-acetylglucosaminyltransferase I (GnTI) and a catalytic region (or portion
thereof) of an N-acetylglucosaminyltransferase II (GnTII), wherein said enzyme acts earlier in the
Golgi apparatus of a plant cell in said plant host system than said N-acetylglucosaminyltransferase II
(GnTII) or modified N-acetylglucosaminyltransferase II (GnTII) where its transmembrane portion
has been deleted and an endoplasmic reticulum retention signal has been inserted. The sequences
encoding N-acetylglucosaminyltransferases or mannosidase II or the said transmembrane fragments
can originate from plants or from eukaryotic non-plant organisms (e.g., mammals).

In yet another preferred embodiment, the present invention contemplates a host cell
expressing TmXyl-GalT plus TmManI-GnTI plus TmManI-ManII plus TmManI-GnTII. In another
embodiment of the invention, galactose residues are added to said arms by introducing to said plant
host system (a) a nucleic acid sequence encoding a first hybrid enzyme comprising a CTS region (or
fragment, such as one including the transmembrane domain) of a protein, particularly an enzyme,
including but not limited to Mannosidase I (ManI) and a catalytic region (or portion thereof) of an
N-acetylglucosaminyltransferase I (GnTI), wherein said enzyme acts earlier in the Golgi apparatus of a
plant cell in said plant host system than said N-acetylglucosaminyltransferase I (GnTI) or modified
N-acetylglucosaminyltransferase I (GnTI) where its transmembrane portion has been deleted and an
endoplasmic reticulum retention signal has been inserted and (b) a nucleic acid sequence encoding
a second hybrid enzyme comprising a CTS region (or fragment, such as one including the
transmembrane domain) of an enzyme including but not limited to Mannosidase I (ManI) and a
catalytic region (or portion thereof) of a Mannosidase II (ManII), wherein said enzyme acts earlier
in the Golgi apparatus of a plant cell in said plant host system than said Mannosidase II (ManII) or
modified Mannosidase II (ManII) where its transmembrane portion has been deleted and an
endoplasmic reticulum retention signal has been inserted and (c) a nucleic acid sequence encoding
a third hybrid enzyme comprising a CTS region (or fragment, such as one including the
transmembrane domain) of an enzyme including but not limited to Mannosidase I (ManI) and a
catalytic region (or portion thereof) of an N-acetylglucosaminyltransferase II (GnTII), wherein said
enzyme acts earlier in the Golgi apparatus of a plant cell in said plant host system than said
N-acetylglucosaminyltransferase II (GnTII) or modified N-acetylglucosaminyltransferase II (GnTII)
where its transmembrane portion has been deleted and an endoplasmic reticulum retention signal
has been inserted. The sequences encoding N-acetylglucosaminyltransferases or mannosidases or
the said transmembrane fragments can originate from plants or from eukaryotic non-plant organisms
(e.g., mammals).

In still another preferred embodiment, the present invention contemplates host cells
expressing TmXyl-GalT plus ManIII. In another embodiment of the invention, galactose residues
are added to said arms by introducing to said plant host system a nucleic acid sequence encoding
a Mannosidase III (ManIII; the wildtype gene sequence, but not limited thereto: also ManIII with an
endoplasmic reticulum retention signal, or ManIII with a transmembrane fragment of an early (cis-)
Golgi apparatus glycosyltransferase (GnTI, ManI, GnTIII)). The sequences encoding Mannosidase III
can originate from insects, preferably (but not limited to) from Spodoptera frugiperda or Drosophila
melanogaster, from humans or from other organisms.

In still another preferred embodiment, the present invention contemplates a host cell
expressing TmXyl-GalT plus ManIII plus TmGnTI-GnTII. In yet another preferred embodiment,
the present invention contemplates a host cell expressing TmXyl-GalT plus ManIII plus
TmManI-GnTI plus TmManI-GnTII.

The method of the invention may optionally comprise, in one embodiment, introducing into
said plant host system a mammalian N-acetylglucosaminyltransferase GnTIII, particularly a human
GnTIII or hybrid protein comprising a catalytic portion of mammalian GnTIII and a transmembrane
portion of a protein, said protein residing in the ER or earlier compartment of the Golgi apparatus of
a eukaryotic cell. For example, in one embodiment, the hybrid enzyme TmXyl-GnTIII is
contemplated (along with nucleic acid coding for such a hybrid enzyme, vectors containing such
nucleic acid, host cells containing such vectors, and plants - or plant parts - containing such host
cells). In another embodiment, the hybrid enzyme TmFuc-GnTIII is contemplated (along with
nucleic acid coding for such a hybrid enzyme, vectors containing such nucleic acid, host cells
containing such vectors, and plants - or plant parts - containing such host cells). The present
invention specifically contemplates host cells expressing such hybrid enzymes (with or without
additional hybrid enzymes or other glycosyltransferases).

The invention is further directed to said hybrid and modified enzymes, nucleic acid
sequences encoding said hybrid enzymes, vectors comprising said nucleic acid sequences and
methods for obtaining said hybrid enzymes. Furthermore, the invention is directed to a plant host
system comprising a heterologous glycoprotein having preferably complex type bi-antennary glycans
that contain at least one galactose residue on at least one of the arms and are devoid of xylose and
fucose. A "heterologous glycoprotein" is a glycoprotein originating from a species other than the
plant host system. The glycoprotein may include but is not limited to antibodies, hormones, growth
factors and growth factor receptors and antigens.

Indeed, the present invention is particularly useful for controlling the glycosylation of
heterologous glycoproteins, such as antibodies or antibody fragments (single chain antibodies, Fab
fragments, F(ab')2 fragments, Fv fragments, and the like). To control the glycosylation of an antibody,
the gene construct encoding a hybrid enzyme of the present invention (e.g., the TmXyl-GalT gene
construct) can be introduced in transgenic plants expressing an antibody (e.g., monoclonal antibody)
or antibody fragment. On the other hand, the gene(s) encoding the antibody (or antibody fragment)
can be introduced by retransformation of a plant expressing the TmXyl-GalT gene construct. In still
another embodiment, the binary vector harbouring the TmXyl-GalT expression cassette can be
co-transformed to plants together with a plant binary vector harbouring the expression cassettes
comprising both light and heavy chain sequences of a monoclonal antibody on a single T-DNA or
with binary vectors harbouring the expression cassettes for light and heavy chain sequences
separately on independent T-DNAs, both together encoding a monoclonal antibody. The present
invention specifically contemplates, in one embodiment, crossing plants expressing antibodies with
plants expressing the hybrid glycosyltransferase(s) of the present invention.

A "host system" may include but is not limited to any organism containing glycoproteins 
with N-glycans.

A "plant host system" may include but is not limited to a plant or portion thereof, which
includes but is not limited to a plant cell, plant organ and/or plant tissue. The plant may be a
monocotyledon (monocot), which is a flowering plant whose embryos have one cotyledon or seed
leaf and includes but is not limited to lilies, grasses, corn (Zea mays), rice, grains including oats,
wheat and barley, orchids, irises, onions and palms. Alternatively, the plant may be a dicotyledon
(dicot), which includes but is not limited to tobacco (Nicotiana), tomatoes, potatoes, legumes (e.g.,
alfalfa and soybeans), roses, daisies, cacti, violets and duckweed. The plant may also be a moss, which
includes but is not limited to Physcomitrella patens.

The invention is further directed to a method for obtaining said plant host system. The
method comprises crossing a plant expressing a heterologous glycoprotein with a
plant comprising (a) a hybrid enzyme comprising a catalytic region (or portion thereof) of a
galactosyltransferase not normally found in a plant and a CTS region (or fragment, such as one
including the transmembrane domain) of a protein, wherein said protein acts earlier in the Golgi
apparatus of a plant cell in said plant host system than said galactosyltransferase or a modified
galactosyltransferase where its transmembrane portion has been deleted and an endoplasmic reticulum
retention signal has been inserted; (b) a hybrid enzyme comprising a CTS region (or portion thereof,
such as one including the transmembrane domain) of a protein, particularly an enzyme, including but
not limited to N-acetylglucosaminyltransferase I (GnTI) and a catalytic region (or portion thereof) of
a mannosidase II (ManII), wherein said enzyme acts earlier in the Golgi apparatus of a plant cell in
said plant host system than said mannosidase II or modified mannosidase II where its transmembrane
portion has been deleted and an endoplasmic reticulum retention signal has been inserted and (c) a
hybrid enzyme comprising at least a transmembrane region of an enzyme (such as the first 40-60
amino acids of the N-terminus) of a glycosyltransferase including but not limited to
N-acetylglucosaminyltransferase I (GnTI) and a catalytic region of an N-acetylglucosaminyltransferase
II (GnTII), wherein said enzyme acts earlier in the Golgi apparatus of a plant cell in said plant host
system than said N-acetylglucosaminyltransferase II (GnTII) or modified
N-acetylglucosaminyltransferase II (GnTII) where its transmembrane portion has been deleted and an
endoplasmic reticulum retention signal has been inserted; harvesting progeny from said crossing; and
selecting a desired progeny plant expressing said heterologous glycoprotein.

The invention is further directed to said plant or portion thereof which would constitute a
plant host system. Said plant host system may further comprise a mammalian GnTIII enzyme or
hybrid protein comprising a catalytic portion of mammalian GnTIII and a transmembrane portion of
a protein, said protein residing in the ER or earlier compartment of the Golgi apparatus of a
eukaryotic cell.

Additionally, the invention also provides the use of a plant host system to produce a desired
glycoprotein or functional fragment thereof. The invention additionally provides a method for
obtaining a desired glycoprotein or functional fragment thereof comprising cultivating a plant
according to the invention until said plant has reached a harvestable stage, for example when
sufficient biomass has grown to allow profitable harvesting, followed by harvesting said plant with
established techniques known in the art and fractionating said plant with established techniques
known in the art to obtain fractionated plant material and at least partly isolating said glycoprotein
from said fractionated plant material.

Alternatively, said plant host cell system comprising said heterologous glycoprotein may
also be obtained by introducing into a plant host cell system or portion thereof (a) a nucleic acid
sequence encoding a hybrid enzyme comprising a catalytic region of a galactosyltransferase not
normally found in a plant and at least the transmembrane region (or more of the CTS) of a protein,
wherein said protein acts earlier in the Golgi apparatus of a plant cell in said plant host system than
said galactosyltransferase or a modified galactosyltransferase where its transmembrane portion has
been deleted and an endoplasmic reticulum retention signal has been inserted; (b) a nucleic acid
sequence encoding a first hybrid enzyme comprising at least the transmembrane region (or more of
the CTS if desired) of a protein, particularly an enzyme, including but not limited to
N-acetylglucosaminyltransferase I (GnTI) and a catalytic region of a mannosidase II (ManII), wherein
said enzyme acts earlier in the Golgi apparatus of a plant cell in said plant host system than said
mannosidase II, or modified mannosidase II where its transmembrane portion has been deleted and
an endoplasmic reticulum retention signal has been inserted and (c) a nucleic acid sequence encoding
a second hybrid enzyme comprising at least a transmembrane region (or more of the CTS if desired) of
an enzyme including but not limited to N-acetylglucosaminyltransferase I (GnTI) and a catalytic
region of an N-acetylglucosaminyltransferase II (GnTII), wherein said enzyme acts earlier in the
Golgi apparatus of a plant cell in said plant host system than said N-acetylglucosaminyltransferase
II (GnTII) or modified N-acetylglucosaminyltransferase II (GnTII) where its transmembrane portion
has been deleted and an endoplasmic reticulum retention signal has been inserted, and isolating a
plant or portion thereof expressing said heterologous glycoprotein (or portion thereof). In one
embodiment, one vector comprising all of the nucleic acid sequences is introduced into said plant
host system. In another embodiment, each nucleic acid sequence is inserted into a separate vector
and these vectors are introduced into said plant host system. In another embodiment, combinations of
two or more nucleic acid sequences are inserted into separate vectors which are then combined into
said plant host system by retransformation or co-transformation or by crossing.

The invention also provides use of such a plant-derived glycoprotein or functional fragment
thereof according to the invention for the production of a composition, particularly a pharmaceutical
composition, for example for the treatment of a patient with an antibody, a hormone, a vaccine
antigen, an enzyme, or the like. Such a pharmaceutical composition comprising a glycoprotein or
functional fragment thereof is now also provided.

Finally, it is contemplated that the above-described approach may be useful in reducing the
overall diversity in glycans in plants expressing one or more of the hybrid enzymes of the present
invention (as compared to wild-type plants or plants simply transformed with only mammalian
GalT).

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 compares the glycosylation pathway of glycoproteins in plants and in mammals.

Figure 2 shows the effect of exchanging the CTS fragment of galactosyltransferase with that of
xylosyltransferase.

Figure 3 shows the further effect of relocalizing mannosidase II and GlcNAcTII.

Figure 4: the top panel shows a T-DNA construct carrying the genes encoding glycan modifying
enzymes to produce efficiently galactosylated glycans that are devoid of immunogenic xylose and
fucose, and the bottom panel shows a T-DNA construct carrying antibody light chain and heavy chain
genes.

Figure 5 shows the nucleic acid sequence (SEQ ID NO:1) for a human galactosyltransferase
(human β1,4-galactosyltransferase - GalT).

Figure 6 shows the nucleic acid sequence of Figure 5 along with the corresponding amino 
acid sequence (SEQ ID NO:2). 

Figure 7 shows an illustrative mutated sequence (SEQ ID NO:59) derived from the wild type
amino acid sequence (SEQ ID NO:2) for a human galactosyltransferase, wherein a serine has been
deleted from the cytoplasmic tail and a G-I-Y motif has been repeated. Of course, such changes are
merely illustrative of the many possible changes within the scope of the present invention. For
example, in one embodiment, the present invention contemplates mutated sequences wherein only
deletions (one or more) are employed (e.g. deletions in the cytoplasmic tail domain or the stem
domain) - with no insertions or repeats. Similarly, in one embodiment, the present invention
contemplates mutated sequences wherein only (one or more) insertions or replacements (e.g. in the
transmembrane domain) are employed - with no deletions.

Figure 8 shows the nucleic acid sequence (SEQ ID NO:3) encoding a hybrid enzyme
comprising human galactosyltransferase (human β1,4-galactosyltransferase - GalT). The upper case
letters are nucleotides of Arabidopsis thaliana mRNA for beta 1,2-xylosyltransferase (database
entry: EMBL:ATH277603; the TmXyl-fragment used involves nucleotides 135-297 of this database
sequence).

Figure 9 shows the nucleic acid sequence of Figure 8 along with the corresponding amino 
acid sequence (SEQ ID NO:4). 

Figure 10 shows the amino acid sequence (SEQ ID NO:4) for the hybrid enzyme encoded by
the nucleic acid shown in Figure 8.

Figure 11 shows the nucleic acid sequence (SEQ ID NO:5) for the human glycosyltransferase
GnTIII (along with additional sequence encoding a myc-tag) (primary accession number Q09327,
GNT3_HUMAN).

Figure 12 shows the nucleic acid sequence of Figure 11 along with the corresponding amino
acid sequence (SEQ ID NO:6).

Figure 13 shows the amino acid sequence (SEQ ID NO:6) for a human GnTIII (along with
additional amino acid sequence of the myc epitope tag SEQ ID NO:7).

Figure 14 shows the nucleic acid sequence (SEQ ID NO:9) encoding one embodiment of a 
hybrid enzyme of the present invention, said hybrid enzyme comprising the transmembrane domain 
of a plant xylosyltransferase (TmXyl-) and the catalytic domain (along with other regions) for human 
GnTIII (TmXyl-GnTIII) (along with additional sequence encoding a myc-tag). 

Figure 15 shows the nucleic acid sequence of Figure 14 along with the corresponding amino 
acid sequence (SEQ ID NO:10). 

Figure 16 shows the amino acid sequence (SEQ ID NO:10) for the hybrid enzyme encoded by
the nucleic acid of Figure 14 (along with additional sequence for the myc epitope tag SEQ ID NO:7).

Figure 17 shows the complete nucleic acid sequence (SEQ ID NO:27) for a cassette encoding
the hybrid enzymes TmXyl-GalT plus TmGnTI-GnTII plus TmGnTI-ManII.

Figure 18 shows the complete nucleic acid sequence (SEQ ID NO:28) for a cassette encoding
the hybrid enzyme TmGnTI-ManII (with the RbcS1 promoter sequence SEQ ID NO:39 shown).

Figure 19 shows the nucleic acid sequence (SEQ ID NO:29) encoding the hybrid enzyme
TmGnTI-ManII.

Figure 20 shows the nucleic acid sequence (SEQ ID NO:30) encoding the hybrid enzyme
TmGnTI-GnTII.

Figure 21 shows the nucleic acid sequence (SEQ ID NO:31) encoding the hybrid enzyme
TmGnTI-GnTII, wherein the transmembrane fragment used (designated TmGnTI) has the nucleic
acid sequence set forth in SEQ ID NO:32.



WO 03/078637 PCT/IB03/01626 


Figure 22A shows the nucleic acid sequence (SEQ ID NO:32) encoding one embodiment of
a transmembrane domain fragment (TmGnTI). Figure 22B shows the nucleic acid sequence (SEQ
ID NO:33) encoding another embodiment of a transmembrane domain fragment (TmManI).

Figure 23 shows the complete nucleic acid sequence (SEQ ID NO:34) for a triple cassette
embodiment of the present invention.

Figure 24 shows the nucleic acid sequence (SEQ ID NO:35) for a hybrid gene expression
cassette (TmManI-GnTI).

Figure 25 shows the nucleic acid sequence (SEQ ID NO:36) for the histone 3.1 promoter.

Figure 26 shows the nucleic acid sequence (SEQ ID NO:37) for the hybrid gene fusion
(TmManI-TmGnTI).

Figure 27 shows the nucleic acid sequence (SEQ ID NO:38) for the hybrid gene fusion
TmManI-ManII (with the RbcS1 promoter sequence SEQ ID NO:39 shown).

Figure 28 shows the nucleic acid sequence (SEQ ID NO:39) for the RbcS1 promoter.

Figure 29 shows the nucleic acid sequence (SEQ ID NO:40) for the hybrid gene
TmManI-ManII wherein the nucleic acid sequence (SEQ ID NO:33) encoding the transmembrane
fragment is shown.

Figure 30 shows the nucleic acid sequence (SEQ ID NO:41) for the hybrid gene
TmManI-GnTII.

Figure 31 shows the nucleic acid sequence (SEQ ID NO:42) for the Lhca promoter.

Figure 32 shows the nucleic acid sequence (SEQ ID NO:43) for the hybrid gene
TmManI-GnTII wherein the nucleic acid sequence (SEQ ID NO:33) encoding the transmembrane
fragment is shown.

Figure 33 shows the nucleic acid sequence (SEQ ID NO:44) for the terminator sequence used
(see below).

Figure 34 is a Western Blot which examines total protein glycosylation of plants of the
present invention compared to control plants.

Figure 35 is a lectin blot with RCA on F1 progeny of crossed plants, said progeny made
according to one embodiment of the present invention.

Figure 36 is a Western Blot. Panel A was assayed with anti-IgG antibody. Panel B was
assayed with an anti-HRP antibody. Panel C was assayed with a specific anti-Xyl antibody fraction.
Panel D was assayed with a specific anti-Fucose antibody fraction. Panel E was assayed with the
lectin RCA.

Figure 37 shows the nucleic acid sequence (SEQ ID NO:49) of a hybrid gene wherein the
aminoterminal CTS region of an insect Mannosidase III gene is replaced by a mouse signal peptide
and a carboxyterminal endoplasmic reticulum retention signal (KDEL) was added.

Figure 38 shows the corresponding amino acid sequence (SEQ ID NO:50) for the nucleic
acid sequence of Figure 37.

Figure 39 shows the nucleic acid sequence (SEQ ID NO:51) of a hybrid gene wherein the
aminoterminal CTS region of a human beta-1,4-galactosyltransferase (GalT) gene is replaced by a
mouse signal peptide and a carboxyterminal endoplasmic reticulum retention signal (KDEL) was
added.

Figure 40 shows the corresponding amino acid sequence (SEQ ID NO:52) for the nucleic
acid sequence of Figure 39.

Figure 41 shows the nucleic acid sequence (SEQ ID NO:53) of a hybrid gene wherein the
aminoterminal CTS region of an Arabidopsis thaliana GnTI gene is replaced by a mouse signal
peptide and a carboxyterminal endoplasmic reticulum retention signal (KDEL) was added.

Figure 42 shows the corresponding amino acid sequence (SEQ ID NO:54) for the nucleic
acid sequence of Figure 41.

Figure 43 shows the nucleic acid sequence (SEQ ID NO:55) of a hybrid gene wherein the
aminoterminal CTS region of an Arabidopsis thaliana GnTII gene is replaced by a mouse signal
peptide and a carboxyterminal endoplasmic reticulum retention signal (KDEL) was added.

Figure 44 shows the corresponding amino acid sequence (SEQ ID NO:56) for the nucleic
acid sequence of Figure 43.

Figure 45 shows the nucleic acid sequence (SEQ ID NO:57) of a hybrid gene wherein the
aminoterminal CTS region of a human beta-1,4-galactosyltransferase (GalT) gene is replaced by the
CTS region of the human gene for GnTI.

Figure 46 shows the corresponding amino acid sequence (SEQ ID NO:58) for the nucleic
acid sequence of Figure 45.

Figure 47 is a schematic of how enzymes might be localized to the Golgi.

Figure 48 is a non-limiting speculative schematic of how the "swapping" of regions of
transferases might cause relocalization.


DETAILED DESCRIPTION OF THE INVENTION 
Hybrid Enzymes 

The nucleic acid sequences encoding the various glycosylation enzymes, such as
mannosidases, GlcNAcTs and galactosyltransferases, may be obtained using various recombinant DNA
procedures known in the art, such as polymerase chain reaction (PCR) or screening of expression
libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990,
PCR: A Guide to Methods and Application, Academic Press, New York. Other nucleic acid
amplification procedures such as ligase chain reaction (LCR), ligation activated transcription (LAT)
and nucleic acid sequence-based amplification (NASBA) or long range PCR may be used.

Once the DNA fragments are generated, identification of the specific DNA fragment
containing the desired gene may be accomplished in a number of ways. For example, if an amount of
a portion of a gene or its specific RNA, or a fragment thereof, is available and can be purified and
labeled, the generated DNA fragments may be screened by nucleic acid hybridization to the labeled
probe [Benton and Davis, Science 196:180 (1977); Grunstein and Hogness, Proc. Natl. Acad. Sci.
U.S.A. 72:3961 (1975)]. Alternatively, the presence of the gene may be detected by assays based on
the physical, chemical, or immunological properties of its expressed product. For example, cDNA
clones, or DNA clones which hybrid-select the proper mRNAs, can be selected which produce a
protein that, e.g., has similar or identical electrophoretic migration, isoelectric focusing behavior,
proteolytic digestion maps, or antigenic properties as known for the protein of interest.

A nucleic acid sequence encoding a hybrid enzyme comprising a transmembrane portion of a
first enzyme and a catalytic portion of a second enzyme may be obtained as follows. The sequence
encoding the transmembrane portion is removed from the second enzyme, leaving a nucleic acid
sequence encoding the C-terminal portion of the second enzyme, which encompasses the catalytic
site. The sequence encoding the transmembrane portion of the first enzyme is isolated or obtained
via PCR and ligated to the sequence encoding the C-terminal portion of the second enzyme.
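The ligation described above can be pictured as an in-frame concatenation of two coding sequences. The following Python sketch is purely illustrative and is not taken from the specification: the function name `make_hybrid_cds`, its parameters and the toy sequences are assumptions, and the number of codons treated as the transmembrane portion would have to be chosen for each real enzyme.

```python
def make_hybrid_cds(first_enzyme_cds: str, second_enzyme_cds: str,
                    first_tm_codons: int, second_tm_codons: int) -> str:
    """Fuse the N-terminal (transmembrane) portion of a first enzyme to the
    C-terminal (catalytic) portion of a second enzyme, keeping the frame.

    first_tm_codons  -- hypothetical number of codons kept from the first enzyme
    second_tm_codons -- hypothetical number of codons removed from the second enzyme
    """
    first = first_enzyme_cds.upper()
    second = second_enzyme_cds.upper()
    if len(first) % 3 or len(second) % 3:
        raise ValueError("coding sequences must consist of whole codons")
    if not 0 < first_tm_codons <= len(first) // 3:
        raise ValueError("first_tm_codons is out of range")
    if not 0 < second_tm_codons < len(second) // 3:
        raise ValueError("second_tm_codons is out of range")
    # Transmembrane portion of the first enzyme (starts with its ATG).
    tm_part = first[: 3 * first_tm_codons]
    # Second enzyme with its own transmembrane portion removed; the
    # remaining C-terminal portion carries the catalytic site and stop codon.
    catalytic_part = second[3 * second_tm_codons:]
    return tm_part + catalytic_part


# Toy sequences (not real genes): 5 "transmembrane" codons from the first
# enzyme replace the first 3 codons of the second enzyme.
toy_first = "ATG" + "AAA" * 4 + "GGG" * 3 + "TAA"
toy_second = "ATG" + "CCC" * 2 + "TTT" * 5 + "TAA"
toy_hybrid = make_hybrid_cds(toy_first, toy_second, 5, 3)
```

Working in whole codons is a design choice made here so that the fusion cannot shift the reading frame of the catalytic portion.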

Modified Enzymes 

A nucleic acid sequence encoding a protein, particularly enzymes such as
galactosyltransferases, mannosidases and N-acetylglucosamine transferases, that is retained in the
ER may be obtained by removing the sequence encoding the transmembrane fragment and
substituting for it a methionine (initiation of translation) codon, and by inserting between the last
codon and the stop codon of the enzyme (e.g., galactosyltransferase) the nucleic acid sequence
encoding an ER retention signal such as the sequence encoding KDEL (amino acid residue sequence:
lysine-aspartic acid-glutamic acid-leucine) [Rothman, Cell 50:521 (1987)].
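The two manipulations named above (replacing the transmembrane-encoding sequence with an initiation codon, and inserting KDEL codons before the stop codon) can be sketched on a coding sequence held as a string. This is a minimal illustration under stated assumptions: the function name `make_er_retained_cds`, the parameter `tm_codons` and the toy sequence are hypothetical, and `AAGGATGAGCTT` is only one of many possible nucleotide encodings of Lys-Asp-Glu-Leu.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# One possible encoding of K-D-E-L: AAG (Lys) GAT (Asp) GAG (Glu) CTT (Leu).
KDEL_CODONS = "AAGGATGAGCTT"


def make_er_retained_cds(cds: str, tm_codons: int) -> str:
    """Return a modified coding sequence in which the first `tm_codons`
    codons (assumed here to encode the transmembrane fragment, including the
    original start codon) are replaced by a single ATG, and KDEL codons are
    inserted between the last sense codon and the stop codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence must consist of whole codons")
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    if not 0 < tm_codons < len(cds) // 3 - 1:
        raise ValueError("tm_codons is out of range")
    # Everything after the transmembrane fragment, without the stop codon.
    luminal_part = cds[3 * tm_codons:-3]
    return "ATG" + luminal_part + KDEL_CODONS + stop


# Toy sequence (not a real gene): 4 leading codons stand in for the
# transmembrane fragment.
toy_cds = "ATG" + "CCC" * 3 + "GGT" * 4 + "TGA"
toy_modified = make_er_retained_cds(toy_cds, 4)
```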

Using Domains and Portions Thereof 

As noted above, the phrases "at least a portion of" or "a fragment of" refer to the minimal
amino acid sequence necessary for a protein or a peptide to retain its natural or native function. For
example, the function of an enzyme could refer to its enzymatic or catalytic role, its ability to anchor
a protein in the Golgi apparatus, or its role as a signal peptide. Thus, the phrases "at least a portion
of a transmembrane domain" or "a fragment of a transmembrane domain" each refer to the smallest
amino acid segment of a larger transmembrane domain that still retains at least part of the native
transmembrane functionality (for example, the function may be evident, albeit decreased). As
another example, the phrases "at least a portion of a catalytic region" or "a fragment of a catalytic
region" each refer to the smallest amino acid segment of a larger catalytic region that still retains at
least part of the native catalytic functionality (again, even if somewhat decreased). As discussed
herein, one skilled in the art will know the minimal amino acid segment that is necessary for a
protein or a peptide to retain at least some of the functionality of the native protein or peptide.

The glycosyltransferase enzymes are typically grouped into families based on the type of
sugar they transfer (galactosyltransferases, sialyltransferases, etc.). Based on amino-acid sequence
similarity and the stereochemical course of the reaction, glycosyltransferases can be classified into at
least 27 and perhaps as many as 47 different families [Campbell et al., Biochem. J. 326:929-939
(1997), Biochem. J. 329:719 (1998)]. The majority of glycosyltransferases cloned to date are type II
transmembrane proteins (i.e., single transmembrane domain with the NH2 terminus in the cytosol
and the COOH terminus in the lumen of the Golgi apparatus). Regardless of how they are classified,
all glycosyltransferases share some common structural features: a short NH2-terminal cytoplasmic
tail, a 16-20 amino acid signal-anchor or transmembrane domain, and an extended stem region
which is followed by the large COOH-terminal catalytic domain. The cytoplasmic tail appears to be
involved in the specific localization of some types of glycosyltransferases to the Golgi [Milland et
al., J. Biol. Chem. 277:10374-10378]. The signal anchor domains can act as both uncleavable signal
peptides and as membrane-spanning regions that orient the catalytic domains of the
glycosyltransferases within the lumen of the Golgi apparatus.

In one embodiment of the present invention, a portion defined by the N-terminal 77 amino
acids of Nicotiana benthamiana (tobacco) N-acetylglucosaminyltransferase I is contemplated for use
in the hybrid enzyme(s), since this portion has been found to be sufficient to target to and to retain a
reporter protein in the plant Golgi apparatus [Essl et al., FEBS Lett. 453:169-173 (1999)].
Subcellular localization in tobacco of various fusion proteins between the putative cytoplasmic,
transmembrane and stem domains revealed that the cytoplasmic-transmembrane domains alone were
sufficient to sustain Golgi retention of β1,2-xylosyltransferase without the contribution of any
luminal sequences [Dirnberger et al., Plant Mol. Biol. 50:273-281 (2002)]. Thus, as noted above,
certain embodiments of the present invention utilize portions of the CTS region which involve only
the cytoplasmic-transmembrane domains (or portions thereof) without utilizing the stem region of
the CTS region. However, while some types of glycosyltransferases rely primarily on their
transmembrane domain for Golgi retention, other types require their transmembrane region and
sequences flanking one or both sides of this region [Colley, Glycobiology 7:1-13 (1997)]. For
example, the N-terminal peptide encompassing amino acids 1 to 32 appears to be the minimal
targeting signal sufficient to localize β1,6 N-acetylglucosaminyltransferase to the Golgi. This
peptide makes up the cytoplasmic and transmembrane domains of this enzyme [Zerfaoui et al.,
Glycobiology 12:15-24].

A great deal of information is available on the amino acid sequences of the domains for
specific glycosyltransferases. For example, the amino acid sequence of the mammalian
galactosyltransferase provided in GenBank Accession No. AAM17731 has the "stem" and
"catalytic" domains spanning residues 19 to 147 and residues 148 to 397, respectively [U.S. Patent
No. 6,416,988, hereby incorporated by reference] - and the present invention, in certain
embodiments, specifically contemplates such portions for use in the hybrid enzyme(s). The amino
acid sequence of the rat liver sialyltransferase provided in GenBank Accession No. AAC91156 has a
9-amino acid NH2-terminal cytoplasmic tail, a 17-amino acid signal-anchor domain, and a luminal
domain that includes an exposed stem region followed by a 41 kDa catalytic domain [Hudgin et al.,
Can. J. Biochem. 49:829-837 (1971); U.S. Patent Nos. 5,032,519 and 5,776,772, hereby incorporated
by reference]. Known human and mouse β1,3-galactosyltransferases have a catalytic domain with
eight conserved regions [Kolbinger et al., J. Biol. Chem. 273:433-440 (1998); Hennet et al., J. Biol.
Chem. 273:58-65 (1998); U.S. Patent No. 5,955,282, hereby incorporated by reference]. For
example, the amino acid sequence of mouse UDP-galactose: β-N-acetylglucosamine
β1,3-galactosyltransferase-I provided in GenBank Accession No. NM020026 has the following catalytic
regions: region 1 from residues 78-83; region 2 from residues 93-102; region 3 from residues
116-119; region 4 from residues 147-158; region 5 from residues 172-183; region 6 from residues
203-206; region 7 from amino acid residues 236-246; and region 8 from residues 264-275 [Hennet et al.,
supra] - all of which are contemplated in certain embodiments of the present invention as useful
portions in the context of the hybrid enzyme(s) discussed above.

While earlier comparisons amongst known cDNA clones of glycosyltransferases had 
revealed very little sequence homology between the enzymes [Paulson et al., J. Biol. Chem. 
264:17615-17618 (1989)], more recent advances have made it possible to deduce conserved domain 
structures in glycosyltransferases of diverse specificity [Kapitonov et al., Glycobiology 9:961-978 
(1999)]. For example, the nucleic acid and amino acid sequences of a number of 
glycosyltransferases have been identified using sequence data provided by the complete genomic 
sequences obtained for such diverse organisms as Homo sapiens (humans), Caenorhabditis elegans 
(soil nematode), Arabidopsis thaliana (thale cress, a mustard) and Oryza sativa (rice). 

As a result of extensive studies, common amino acid sequences have been deduced for 
homologous binding sites of various families of glycosyltransferases. For example, sialyltransferases 
have sialyl motifs that appear to participate in the recognition of the donor substrate, CMP-sialic acid 
[Paulson et al., J. Biol. Chem. 264:17615-17618 (1989); Datta et al., J. Biol. Chem. 270:1497-1500 
(1995); Katsutoshi, Trends Glycosci. Glycotech. 8:195-215 (1996)]. The hexapeptide RDKKND in 
Gal α1-3 galactosyltransferase and RDKKNE in GlcNAc β1-4 galactosyltransferase have been 
suggested as the binding site for UDP-Gal [Joziasse et al., J. Biol. Chem. 260:4941-4951 (1985), J. 
Biol. Chem. 264:14290-14297 (1989); Joziasse, Glycobiology 2:271-277 (1992)]. 

A small, highly conserved motif formed by two aspartic acid residues (DXD), which is 
frequently surrounded by a hydrophobic region, has been identified in a large number of different 
eukaryotic transferases, including α-1,3-mannosyltransferase, β-1,4-galactosyltransferases, 
α-1,3-galactosyltransferases, glucuronyltransferases, fucosyltransferases, glycogenins and others 
[Wiggins et al., Proc. Natl. Acad. Sci. U.S.A. 95:7945-7950 (1998)]. Mutation studies indicate that 
this motif is necessary for enzymatic activity [Busch et al., J. Biol. Chem. 273:19566-19572 (1998); 
Wang et al., J. Biol. Chem. 277:18568-18573 (2002)]. Multiple peptide alignment showed several 
motifs corresponding to putative catalytic domains that are conserved throughout all members of the 
β3-galactosyltransferase family, namely, a type II transmembrane domain, a conserved DxD motif, an 
N-glycosylation site and five conserved cysteines [Gromova et al., Mol. Carcinog. 32:61-72 (2001)]. 

Through the use of BLAST searches and multiple alignments, the E-X7-E motif was found to 
be highly conserved among the members of four families of retaining glycosyltransferases [Cid et 
al., J. Biol. Chem. 275:33614-33621 (2000)]. The O-linked acetylglucosaminyltransferases 
(GlcNAc) add a single β-N-acetylglucosamine moiety to specific serine or threonine hydroxyls. 
BLAST analyses, consensus secondary structure predictions and fold recognition studies indicate 
that a conserved motif in the second Rossmann domain points to the UDP-GlcNAc donor-binding 
site [Wrabl et al., J. Mol. Biol. 314:365-374 (2001)]. The β1,3-glycosyltransferase enzymes 
identified to date share several conserved regions and conserved cysteine residues, all being located 
in the putative catalytic domain. Site-directed mutagenesis of the murine β3GalT-I gene (Accession 
No. AF029790) indicates that the conserved residues W101 and W162 are involved in the binding of 
the UDP-galactose donor, the residue W315 in the binding of the N-acetylglucosamine-β-p-
nitrophenol acceptor, and the domain including E264 appears to participate in the binding of both 
substrates [Malissard et al., Eur. J. Biochem. 269:233-239 (2002)]. 
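
Short sequence motifs of the kind described above (DXD and E-X7-E) can be located in a protein sequence with a simple pattern search. The following Python sketch is illustrative only: the function name and the example sequence are hypothetical and are not taken from any of the enzymes cited, and a motif hit by itself does not establish function.

```python
import re

def find_motif(sequence, pattern):
    """Return (1-based position, matched residues) for every occurrence,
    including overlapping ones, of a regular-expression motif."""
    # A lookahead with a capturing group reports overlapping matches.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Hypothetical sequence used only to exercise the search.
example = "MAVLLDADGSWEAAAAAAAEK"

dxd_hits = find_motif(example, "D.D")        # two Asp separated by any residue
ex7e_hits = find_motif(example, "E.{7}E")    # two Glu separated by seven residues

print(dxd_hits)
print(ex7e_hits)
```

Positions are reported with the first residue numbered 1, matching the residue numbering convention used in the text.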

Expression of Proteins of Interest in Plant Host System 

The nucleic acid encoding the hybrid or modified enzymes or other heterologous proteins, 
such as a heterologous glycoprotein, may be inserted according to certain embodiments of the present 
invention into an appropriate expression vector, i.e., a vector which contains the necessary elements 
for the transcription and translation of the inserted coding sequence, or in the case of an RNA viral 
vector, the necessary elements for replication and translation, as well as selectable markers. These 
include but are not limited to a promoter region, a signal sequence, 5' untranslated sequences, 
initiation codon (depending upon whether or not the structural gene comes equipped with one), and 
transcription and translation termination sequences. Methods for obtaining such vectors are known in 
the art (see WO 01/29242 for review). 

Promoter sequences suitable for expression in plants are described in the art, e.g., WO 
91/198696. These include non-constitutive promoters or constitutive promoters, such as the 
nopaline synthetase and octopine synthetase promoters, cauliflower mosaic virus (CaMV) 19S and 
35S promoters and the figwort mosaic virus (FMV) 35 promoter (see U.S. Pat. Nos. 5,352,605 and 
6,051,753, both of which are hereby incorporated by reference). Promoters used may also be tissue 
specific promoters targeted for example to the endosperm, aleurone layer, embryo, pericarp, stem, 
leaves, tubers, roots, and the like. 

A signal sequence allows processing and translocation of a protein where appropriate. The 
signal can be derived from plants or could be non-plant signal sequences. The signal peptides direct 
the nascent polypeptide to the endoplasmic reticulum, where the polypeptide subsequently undergoes 
post-translational modification. Signal peptides can routinely be identified by those of skill in the art. 
They typically have a tripartite structure, with positively charged amino acids at the N-terminal end, 
followed by a hydrophobic region and then the cleavage site within a region of reduced 
hydrophobicity. 

The transcription termination region is routinely at the opposite end from the transcription initiation 
regulatory region. It may be associated with the transcriptional initiation region or derived from a 
different gene and may be selected to enhance expression. Examples are the NOS terminator from the 
Agrobacterium Ti plasmid and the rice alpha-amylase terminator. Polyadenylation tails may also be 
added. Examples include but are not limited to the Agrobacterium octopine synthetase signal [Gielen 
et al., EMBO J. 3:835-846 (1984)] or the nopaline synthase signal of the same species [Depicker et 
al., Mol. Appl. Genet. 1:561-573 (1982)]. 

Enhancers may be included to increase and/or maximize transcription of the heterologous 
protein. These include, but are not limited to, peptide export signal sequence, codon usage, introns, 
polyadenylation, and transcription termination sites (see WO 01/29242). 

Markers include preferably prokaryote selectable markers. Such markers include resistance 
toward antibiotics such as ampicillin, tetracycline, kanamycin, and spectinomycin. Specific 
examples include but are not limited to the streptomycin phosphotransferase (spt) gene coding for 
streptomycin resistance, the neomycin phosphotransferase (nptII) gene encoding kanamycin or 
geneticin resistance, and the hygromycin phosphotransferase (hpt) gene encoding resistance to 
hygromycin. 

The vectors constructed may be introduced into the plant host system using procedures 
known in the art (reviewed in WO 01/29242 and WO 01/31045). The vectors may be modified to 
intermediate plant transformation plasmids that contain a region of homology to an Agrobacterium 
tumefaciens vector, a T-DNA border region from A. tumefaciens. Alternatively, the vectors used in 
the methods of the present invention may be Agrobacterium vectors. Methods for introducing the 
vectors include but are not limited to microinjection, velocity ballistic penetration by small particles 
with the nucleic acid either within the matrix of small beads or particles, or on the surface, and 
electroporation. The vector may be introduced into a plant cell, tissue or organ. In a specific 
embodiment, once the presence of a heterologous gene is ascertained, a plant may be regenerated 
using procedures known in the art. The presence of desired proteins may be screened using methods 
known in the art, preferably using screening assays where the biologically active site is detected in 
such a way as to produce a detectable signal. This signal may be produced directly or indirectly. 
Examples of such assays include ELISA or a radioimmunoassay. 

Transient Expression 

The present invention specifically contemplates both stable and transient expression of the 
above-described hybrid enzymes. Techniques for transforming a wide variety of higher plant species 
for transient expression of an expression cassette are well known [see, for example, Weising et al., 
Ann. Rev. Genet. 22:421-477 (1988)]. Variables of different systems include the type of nucleic acid 
transferred (DNA, RNA, plasmid, viral), type of tissue transformed, means of introducing 
transgene(s), and conditions of transformation. For example, a nucleic acid construct may be 
introduced directly into a plant cell using techniques such as electroporation, PEG poration, 
particle bombardment, silicon fiber delivery, microinjection of plant cell protoplasts or embryogenic 
callus or other plant tissue, or Agrobacterium-mediated transformation [Hiei et al., Plant J. 
6:271-282 (1994)]. Because transformation efficiencies are variable, internal standards (e.g., 
35S-Luc) are often used to standardize transformation efficiencies. 
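
The use of an internal standard can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each transformed sample yields one reading for the construct under test and one reading for the co-delivered internal standard (for example a 35S-Luc signal); the function name, sample names and readings below are hypothetical.

```python
def normalized_expression(test_signal, standard_signal):
    """Express the test construct signal relative to the internal standard,
    so that samples with different transformation efficiency are comparable."""
    if standard_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    return test_signal / standard_signal

# Hypothetical readings: (test construct, internal standard) per sample.
samples = {
    "sample_A": (1200.0, 300.0),  # efficiently transformed
    "sample_B": (400.0, 100.0),   # poorly transformed
}

for name, (test, standard) in samples.items():
    print(name, normalized_expression(test, standard))
```

In this made-up case the raw test signals differ threefold, but the normalized values are identical, which is the situation an internal standard is meant to reveal.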

Expression constructs for transient assays include plasmids and viral vectors. A variety of 
plant viruses that can be employed as vectors are known in the art and include cauliflower mosaic 
virus (CaMV), geminivirus, brome mosaic virus, and tobacco mosaic virus. 

Plant tissues suitable for transient expression include cultured cells, either intact or as 
protoplasts (in which the cell wall is removed), cultured tissue, cultured plants, and plant tissue such 
as leaves. 

Some transient expression methods utilize gene transfer into plant cell protoplasts mediated 
by electroporation or polyethylene glycol (PEG). These methods require the preparation and culture 
of plant protoplasts, and involve creating pores in the protoplast through which nucleic acid is 
transferred into the interior of the protoplast. 

Exemplary electroporation techniques are described in Fromm et al., Proc. Natl. Acad. Sci. 
82: 5824 (1985). The introduction of DNA constructs using polyethylene glycol precipitation is 
described in Paszkowski et al., EMBO J. 3: 2717-2722 (1984). PEG-mediated transformation of 
tobacco protoplasts, which includes the steps of isolation, purification, and transformation of the 
protoplasts, is described in Lyck et al., (1997) Planta 202: 117-125; Scharf et al., (1998) Mol. 
Cell. Biol. 18: 2240-2251; and Kirschner et al., (2000) The Plant J 24(3): 397-411. These methods 
have been used, for example, to identify cis-acting elements in promoters activated by external 
stimuli (Abel and Theologis (1994) Plant J 5: 421-427; Hattori et al., (1992) Genes Dev 6: 
609-618; Sablowski et al., (1994) EMBO J 13: 128-137; and Solano et al., (1995) EMBO J 14: 
1773-1784), as well as for other gene expression studies (U. S. Patent 6,376,747, hereby incorporated 
by reference). 

Ballistic transformation techniques are described in Klein et al., (1987) Nature 327: 70-73. 
Biolistic transient transformation is used with suspension cells or plant organs. For example, it has 
been developed for use in Nicotiana tabacum leaves (Godon et al., (1993) Biochimie 75(7): 591-595). 
It has also been used in investigating plant promoters (Baum et al., (1997) Plant J 12: 463-469; 
Stromvik et al., (1999) Plant Mol. Biol. 41(2): 217-31; Tuerck and Fromm (1994) Plant Cell 6: 
1655-1663; and U. S. Patent 5,847,102, hereby incorporated by reference), and to characterize 
transcription factors (Goff et al., (1990) EMBO J 9: 2517-2522; Gubler et al., (1999) Plant J 17: 
1-9; and Sainz et al., (1997) Plant Cell 9: 611-625). 

Other methods allow visualization of transient expression of genes in situ, such as with onion 
epidermal peels, in which GFP expression in various cellular compartments was observed (Scott et 
al., (1999) Biotechniques 26(6): 1128-1132). 

Nucleic acids can also be introduced into plants by direct injection. Transient gene 
expression can be obtained by injection of the DNA into reproductive organs of a plant (see, for 
example, Pena et al., (1987) Nature 325:274), such as by direct DNA transfer into pollen (see, for 
example, Zhou et al., (1983) Methods in Enzymology 101:433; D. Hess (1987) Intern. Rev. Cytol. 
107:367; Luo et al., (1988) Plant Mol. Biol. Reporter 6:165). DNA can also be injected directly into 
the cells of immature embryos (see, for example, Neuhaus et al., (1987) Theor. Appl. Genet. 75:30; 
and Benbrook et al., (1986) in Proceedings Bio Expo 1986, Butterworth, Stoneham, Mass., pp. 
27-54). 

Agrobacterium-mediated transformation is applicable to both dicots and monocots. 
Optimized methods and vectors for Agrobacterium-mediated transformation of plants in the family 
Gramineae, such as rice and maize, have been described (see, for example, Heath et al., (1997) Mol. 
Plant-Microbe Interact. 10:221-227; Hiei et al., (1994) Plant J. 6:271-282 and Ishida et al., (1996) 
Nat. Biotech. 14:745-750). The efficiency of maize transformation is affected by a variety of factors 
including the types and stages of tissue infected, the concentration of Agrobacterium, the tissue 
culture media, the Ti vectors and the maize genotype. 

Another useful basic transformation protocol involves a combination of wounding by particle 
bombardment, followed by use of Agrobacterium for DNA delivery (see, for example, Bidney et al., 
(1992) Plant Mol. Biol. 18:301-313). Both intact meristem transformation and split meristem 
transformation methods are also known (U. S. Patent 6,300,545, hereby incorporated by reference). 

Additional methods utilizing Agrobacteria include agroinfection and agroinfiltration. By 
inserting a viral genome into the T-DNA, Agrobacterium can be used to mediate the viral infection 
of plants (see, for example, U. S. Patent 6,300,545, hereby incorporated by reference). Following 
transfer of the T-DNA to the plant cell, excision of the viral genome from the T-DNA (mobilization) 
is required for successful viral infection. This Agrobacterium-mediated method for introducing a 
virus into a plant host is known as agroinfection (see, for example, Grimsley, "Agroinfection," pp. 
325-342, in Methods in Molecular Biology, vol 44: Agrobacterium Protocols, ed. Gartland and 
Davey, Humana Press, Inc., Totowa, N.J.; and Grimsley (1990) Physiol. Plant. 79:147-153). 

The development of plant virus gene vectors for expression of foreign genes in plants 
offers a means of obtaining high levels of gene expression within a short time. 
Suitable viral replicons include double-stranded DNA from a virus having a double stranded DNA 
genome or replication intermediate. The excised viral DNA is capable of acting as a replicon or 
replication intermediate, either independently, or with factors supplied in trans. The viral DNA may 
or may not encode infectious viral particles and furthermore may contain insertions, deletions, 
substitutions, rearrangements or other modifications. The viral DNA may contain heterologous 
DNA, which is any non-viral DNA or DNA from a different virus. For example, the heterologous 
DNA may comprise an expression cassette for a protein or RNA of interest. 

Super binary vectors carrying the vir genes of Agrobacterium strains A281 and A348 are 
useful for high efficiency transformation of monocots. However, even without the use of high 
efficiency vectors, it has been demonstrated that T-DNA is transferred to maize at an efficiency that 
results in systemic infection by viruses introduced by agroinfection, although tumors are not formed 
(Grimsley et al., (1989) Mol. Gen. Genet. 217:309-316). This is because integration of the T-DNA 
containing the viral genome is not required for viral multiplication, since the excised viral genome 
acts as an independent replicon. 

Another Agrobacteria-mediated transient expression assay is based on Agrobacterium-
mediated transformation of tobacco leaves in planta (Yang et al., (2000) The Plant J 22(6): 
543-551). The method utilizes infiltration of agrobacteria carrying plasmid constructs into tobacco 
leaves, and is referred to as agroinfiltration; it has been used to analyze in vivo expression of 
promoters and transcription factors in as little as 2-3 days. It also allows examination of effects of 
external stimuli such as pathogen infections and environmental stresses on promoter activity in situ. 

Example 1 

An Arabidopsis thaliana cDNA encoding pentosyltransferase was isolated from a cDNA 
library by a previously described PCR-based sibling selection procedure 
[Bakker et al., BBRC 261:829 (1999)]. Xylosyltransferase activity was confirmed by 
immunostaining of transfected CHO cells with a xylose specific antibody purified from rabbit-anti-
horseradish-peroxidase antiserum. A DNA fragment covering the N-terminal part of the 
xylosyltransferase was amplified using primers: 

XylTpvuF: ATACTCGAGTTAACAATGAGTAAACGGAATC (SEQ ID NO:45) 
and XylTpvuR: TTCTCGATCGCCGATTGGTTATTC (SEQ ID NO:46) 

XhoI and HpaI restriction sites were introduced in front of the start codon and a PvuI site was 
introduced at the reverse end. A C-terminal fragment from human β1,4-galactosyltransferase (acc. 
no. x55415, Aoki 1992) was amplified using primers GalTpvuF: GCCGCCGCGATCGGGCAGTCCTCC 
(SEQ ID NO:47) and GalTrev: AACGGATCCACGCTAGCTCGGTGTCCCGAT (SEQ ID NO:48), thus 
introducing PvuI and BamHI sites. The XhoI/PvuI and PvuI/BamHI digested PCR fragments were 
ligated in XhoI/BamHI digested pBluescriptSK+ and sequenced. The resulting open reading frame 
encodes a fusion protein containing the first 54 amino acids of A. thaliana pentosyltransferase 
fused with amino acids 69 to 398 of human β1,4-galactosyltransferase and is designated as TmXyl-
GalT. The fragment was cloned into a plant expression vector between the CaMV35S promoter and 
Nos terminator, using HpaI/BamHI. The clone was introduced into Nicotiana tabacum (samsun NN) 
as described for native human β1,4-galactosyltransferase [Bakker et al., Proc. Nat. Acad. Sci. USA 
98:2899 (2001)]. 
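
The restriction sites carried by the four primers, and the expected length of the fusion protein, can be checked with a few lines of Python. This is a sketch for illustration only: the primer sequences are those given above, the recognition sequences are the standard ones for XhoI, HpaI, PvuI and BamHI, and the function name is hypothetical.

```python
# Recognition sequences (all palindromic, so primer orientation does not matter).
SITES = {"XhoI": "CTCGAG", "HpaI": "GTTAAC", "PvuI": "CGATCG", "BamHI": "GGATCC"}

PRIMERS = {
    "XylTpvuF": "ATACTCGAGTTAACAATGAGTAAACGGAATC",
    "XylTpvuR": "TTCTCGATCGCCGATTGGTTATTC",
    "GalTpvuF": "GCCGCCGCGATCGGGCAGTCCTCC",
    "GalTrev": "AACGGATCCACGCTAGCTCGGTGTCCCGAT",
}

def sites_in(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}

for name, seq in PRIMERS.items():
    print(name, sites_in(seq))

# First 54 residues of the xylosyltransferase plus residues 69 to 398
# (inclusive) of the galactosyltransferase.
fusion_length = 54 + (398 - 69 + 1)
print("fusion protein length:", fusion_length)
```

In XylTpvuF the XhoI and HpaI sites overlap by one base (the final G of CTCGAG is the first G of GTTAAC), and both lie upstream of the ATG start codon, consistent with the description above.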

Protein extract of transgenic plants and Western Blots were made as described [Bakker et al., 
Proc. Nat. Acad. Sci. USA 98:2899 (2001)]. Based on reaction with the lectin RCA, a transgenic 
plant expressing TmXylGalT was selected for further glycan analysis by MALDI-TOF [Elbers et al., 
Plant Physiology 126:1314 (2001)] and compared with glycans isolated from plants expressing native 
β1,4-galactosyltransferase and with glycans from wild-type plants. Relative peak areas of the 
MALDI-TOF spectrum are given in Table 1. That is to say, Table 1 is a comparison of the results of 
mass spec (MALDI-TOF) analysis of N-glycans of endogenous glycoproteins of control tobacco 
("Tobacco"), transgenic tobacco expressing human beta-1,4-galactosyltransferase ("GalT") and 
transgenic tobacco plants expressing the beta-1,4-galactosyltransferase gene of which the CTS 
region has been replaced with that of beta-1,2-xylosyltransferase ("TmXyl-GalT"). 

These data show that: 

1. In TmXylGalT plants, xylosylation and fucosylation of the glycans is dramatically 
reduced: 82% of the glycans do not carry xylose nor fucose as compared to 14% in wild-
type plants. 

2. Galactosylation has increased from 9% in GalT plants to 32% in TmXylGalT plants. 

Example 2 

A transgenic plant expressing said TmXyl-GalT gene (TmXyl-GalT-12 plant) was selected 
(above) based on lectin blotting using biotin-labelled RCA (Vector Laboratories, Burlingame, 
California). Comparison of protein extracts of MGR48 transgenic (control) plant, a selected 
transgenic plant expressing the unmodified human β1,4-galactosyltransferase gene and TmXyl-
GalT-12 plant for the presence of xylose and fucose using anti-HRP (horseradish peroxidase) 
polyclonal antibody (known for high anti-xylose and anti-fucose reactivity) clearly showed reduced 
xylose and fucose (Figure 34: "Anti-HRP"). Western blotting using an anti-xylose fraction of the 
anti-HRP and an anti-fucose fraction (each of which can be prepared by affinity chromatography 
over the appropriate ligand) showed that especially xylose was reduced compared to control plants 
(Figure 34: "anti-Fuc" and "anti-Xyl"). 

Example 3 

The TmXyl-GalT-12 plant was crossed with a transgenic plant expressing the monoclonal 
antibody MGR48 from a single T-DNA integration event (MGR48-31) and which was first made 
homozygous by selecting offspring plants not segregating for the kanamycin resistance marker and 
antibody production (MGR48-31-4). Pollen of MGR48-31-4 was used for pollination of emasculated 
TmXyl-GalT-12 plants. Vice versa, pollen of TmXyl-GalT-12 plant was used for fertilization of 
emasculated MGR48-31-4 plants. A number of F1 plants were analyzed for the presence of MGR48 
by western blotting and for galactosylation of endogenous glycoproteins by lectin blotting using 
RCA (Figure 35). One plant expressing MGR48 and showing galactosylation of endogenous 
glycoproteins was selected for further analysis. This plant was identified as XGM8. 

Seeds from TmXyl-GalT-12 (♀) x MGR48-31-4 (♂) were sown and F1 offspring plants 
(XGM) were analysed for antibody production by Western blotting and for galactosylation by lectin 
blotting using biotinylated RCA120 (Vector Labs., Burlingame, California) using standard 
techniques as described before. All plants, as expected, expressed the monoclonal antibody MGR48 
and the majority also had galactosylated glycans, as shown by lectin blotting using RCA120. A 
single plant expressing antibody MGR48 and having galactosylated N-glycans was chosen for 
further analysis (XGM8) (TmXyl-GalT-12 X MGR48-31-4 offspring plant 8). The monoclonal 
recombinant MGR48 antibody was purified from this plant as described before and submitted to N-
glycan analysis by MALDI-TOF. 

Briefly, the XGM8 plant was grown in a greenhouse for antibody production under optimal 
conditions [Elbers et al., Plant Physiology 126:1314 (2001)]. Protein extract of leaves of the 
transgenic XGM8 plant was made and monoclonal antibody was purified using protein G 
chromatography as described [Bakker et al., Proc. Nat. Acad. Sci. USA 98:2899 (2001)]. MALDI-TOF 
of N-glycans of purified monoclonal antibody was as described (Elbers et al., 2001, supra). The 
presence of galactose on glycans was established by enzyme sequencing using bovine testis 
β-galactosidase as described (Bakker et al., 2001, supra; Table 2). Table 2 (below) is a comparison 
of the results of mass spec (MALDI-TOF) analysis of N-glycans of endogenous glycoproteins 
("Xyl-GalT Endo") of an F1 hybrid of TmXyl-GalT-12 plant and plant producing rec-mAb (MGR48) 
and of N-glycans of rec-mAb purified by protein G chromatography from said F1 hybrid. 

These data show that: 

1. In the F1 hybrid, xylosylation and fucosylation of the glycans is dramatically reduced: 43% of 
the glycans of endogenous glycoproteins lack xylose and fucose as compared to only 14% in 
wild-type tobacco plants. 

2. The glycans of purified mAb of this F1 hybrid have reduced xylose and fucose: 47% lack xylose 
and fucose, compared to 14% for wild-type tobacco. See also Figure 36, panels B-D. 

3. Galactosylation of endogenous glycoproteins of the F1 hybrid has increased from 9% in GalT 
plants to 37% in the F1 TmXyl-GalT X MGR48 plant. See also Figure 35. 

4. Purified rec-mAb from said F1 (see Figure 36, panel A) shows increased galactosylation; that is 
to say, 46% has galactose. See also Figure 36, panel E. 

It should however be noted that the observed quantities (MALDI-TOF) do not necessarily reflect the 
molar ratios of said glycoforms in vivo. Quantification based on MALDI-TOF can be under- or 
overestimated depending on the specific glycoform under study. Also, since there is no molecular 
weight difference between Gal and Man, some peaks cannot be annotated unambiguously unless 
there are clear differences in relative height of specific molecules before and after galactosidase 
treatment. 
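
The ambiguity between galactose and mannose follows from simple mass arithmetic, which the Python sketch below illustrates. It assumes that the observed ions are sodium adducts [M+Na]+ and uses standard monoisotopic residue masses; that assumption is not stated in the text, although it reproduces the tabulated nominal m/z values (for example 933 for M3). The function name is hypothetical.

```python
# Monoisotopic residue masses in Da (sugar minus water).
HEXOSE = 162.0528   # galactose and mannose are both hexoses
HEXNAC = 203.0794   # N-acetylglucosamine
FUCOSE = 146.0579
XYLOSE = 132.0423
WATER = 18.0106
SODIUM_ION = 22.9892  # assumption: ions detected as [M+Na]+

def sodium_adduct_mz(gal, man, glcnac, fuc=0, xyl=0):
    """m/z of the singly charged sodium adduct of an N-glycan composition."""
    residues = ((gal + man) * HEXOSE + glcnac * HEXNAC
                + fuc * FUCOSE + xyl * XYLOSE)
    return residues + WATER + SODIUM_ION

m3 = sodium_adduct_mz(gal=0, man=3, glcnac=2)        # Man3GlcNAc2
gal_gn_m4 = sodium_adduct_mz(gal=1, man=4, glcnac=3)  # GalGNM4
gn_m5 = sodium_adduct_mz(gal=0, man=5, glcnac=3)      # GNM5

print(int(m3), int(gal_gn_m4), int(gn_m5))
```

Because GalGNM4 and GNM5 contain the same numbers of hexose and N-acetylhexosamine residues, their calculated masses are identical, and only the response to galactosidase treatment can distinguish them.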

Example 4 

A more direct comparison of xylose, fucose and galactose content was done by examining 
the MGR48 IgG antibodies from hybridoma, transgenic tobacco and TmXyl-GalT transgenic 
tobacco. As mentioned above, the TmXyl-GalT-12 plant was crossed with a tobacco plant expressing 
MGR48 IgG (MGR48 tobacco), resulting in an F1 hybrid harbouring MGR48 TmXyl-GalT. An F1 
plant was chosen for extraction and purification of MGR48 IgG. Antibodies from said plants 
(tobacco and TmXyl-GalT) were isolated and purified using protein G chromatography (Elbers et 
al., 2001, Plant Physiology 126: 1314-1322). 300-nanogram amounts of each, hybridoma MGR48 
and plant-derived recMGR48, were loaded on precast 12% SDS-PAGE gels (BioRad) and run. The 
contents of each lane were as follows: Lane 1, MGR48 from hybridoma; Lane 2, purified recMGR48 
from normal transgenic tobacco plant; and Lane 3, purified recMGR48 from TmXyl-GalT transgenic 
plant. Following SDS-PAGE, proteins were transferred to nitrocellulose using CAPS buffer. Blots 
were incubated with A, anti-mouse IgG; B, polyclonal rabbit anti-HRP (anti-xylose/(alpha 
1,3-)fucose); C, anti-xylose; D, anti-(alpha 1,3-)fucose antibodies; and E, biotinylated RCA. Detection 
was with LumiLight on Lumi Imager following incubation with HRP-labelled sheep anti-mouse 
(panel A) or goat anti-rabbit (panels B-D) antibodies and HRP-labelled streptavidin (E). 

Panel A shows that approximately similar amounts of the MGR48 IgG were loaded for all 
lanes (1-3). L refers to the light chain and H to the heavy chain of MGR48 IgG. 

Panel B shows that the heavy chain of MGR48 antibody in lane 2 (tobacco) strongly reacts with anti-
HRP as expected, whereas the heavy chain of hybridoma derived MGR48 (lane 1) does not (as 
expected). Hybridoma derived antibodies do not carry xylose and alpha 1,3-fucose residues. 
Remarkably, MGR48 antibodies from the TmXyl-GalT tobacco plant also do not react, suggesting that 
the heavy chain of antibody from this plant has significantly reduced (perhaps by 90% or more) 
amounts of xylose and fucose residues on the N-glycans. This is confirmed by experiments depicted 
in panels C (anti-xylose) and D (anti-fucose). Panel E shows that the heavy chain of MGR48 
antibody of hybridoma (lane 1) has a galactosylated N-glycan, whereas tobacco-derived MGR48 
(lane 2) has not, both as expected. Heavy chain of MGR48 from the TmXyl-GalT plant (lane 3) also 
has galactosylated N-glycan due to the presence of the construct expressing the hybrid enzyme. 

These data are in agreement with the data obtained from similar experiments using total 
protein extracts from similar plants (tobacco and TmXyl-GalT-12 plant) as shown previously and 
confirm that the novel trait introduced in tobacco from expression of the TmXyl-GalT gene can be 
stably transmitted to offspring and to a recombinant monoclonal antibody. 

Example 5 

Further characterization of the above-described F1 hybrid was performed by treatment with 
beta-galactosidase. Table 3 is a comparison of the results of mass spec (MALDI-TOF) analysis of 
N-glycans of rec-mAbs purified by protein G chromatography from an F1 hybrid of TmXyl-GalT 
and MGR48 plant before and after treatment of the glycans with beta-galactosidase. 



TABLE 3 

                          Xyl-GalT IgG           Xyl-GalT IgG 
m/z     Type              -beta-galactosidase    +beta-galactosidase 

933     M3                4                      4 
1065    XM3               2                      2 
1079    FM3               3                      3 
1095    M4                5                      4 
1136    GNM3              2                      3 
1211    FXM3              3                      4 
1241    FM4               2                      2 
1257    M5                12                     13 
1268    GNXM3             2                      3 
1282    GNFM3             3                      3 
1298    GalGNM3           4                      4 
1403    FM5               3                      2 
1414    GNFXM3            4                      5 
1419    M6                4                      3 
1430    GNXM4             2                      2 
1430    GalGNXM3 
1444    GNFM4             3                      3 
1444    GalGNFM3 
1460    GalGNM4           10                     14 
1460    GNM5 
1471    GN2XM3                                   1 
1485    GN2FM3            1                      1 
1501    GalGN2M3          1 
1576    GalGNFXM3         3                      3 
1576    GNFXM4 
1581    M7                2                      2 
1593    GalGNXM4          2                      2 
1593    GNXM5 
1606    GNFM5             4                      6 
1606    GalGNFM4 
1617    GN2FXM3           1                      1 
1622    GalGNM5           6                      1 
1622    GNM6 
1647    GalGN2FM3         1 
1663                      1 
1738    GNFXM5            2                      2 
1738    GalGNFXM4 
1743    M8                2                      2 
1754    GalGNXM5          2                      1 
1768    GalGNFM5          3                      1 
1768    GNFM6 
1784    GNM7              1                      1 
1784    GalGNM6 
1809    Gal2GN2FM3        1 
1900    GNFXM6                                   1 
1900    GalGNFXM5 
1905    M9                1                      1 

        TOTAL             102                    100 



These data show that: 

1. Rec-mAbs from the F1 hybrid contain galactose, which can be deduced from the observed 
reduction of specific (galactose-containing) glycoforms after beta-galactosidase treatment and the 
increase of glycoforms lacking galactose. Note the observed reduction of m/z 1622 from 6 to 1% and 
the simultaneous increase of m/z 1460 from 10 to 14%, which is the result of the removal of 
galactose from GalGNM5 to give rise to GNM5. The same is true for m/z 1768 (3 to 1% 
decrease) and the corresponding m/z 1606 peak (4 to 6% increase). See also Figure 36, panel E. 

2. Similarly, a number of peaks that can be attributed to galactose-containing glycans vanish upon 
treatment with galactosidase, especially m/z 1501, 1647 and 1663, confirming the presence of 
galactose. 
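
The reasoning in points 1 and 2 can be expressed as a small search for precursor and product peaks that differ by one hexose residue (nominal mass 162). The Python sketch below uses only the peak values from Table 3 that are discussed above; the function name is hypothetical, and the approach is an illustration rather than a validated annotation method.

```python
HEXOSE_SHIFT = 162  # nominal mass of one hexose (galactose) residue

# m/z -> (relative peak area before, after beta-galactosidase), from Table 3.
PEAKS = {
    1460: (10, 14),
    1501: (1, 0),
    1606: (4, 6),
    1622: (6, 1),
    1647: (1, 0),
    1663: (1, 0),
    1768: (3, 1),
}

def galactose_loss_pairs(peaks, shift=HEXOSE_SHIFT):
    """Find (precursor, product) m/z pairs where the precursor decreases after
    treatment and the peak one hexose lighter increases."""
    pairs = []
    for mz, (before, after) in sorted(peaks.items()):
        product = mz - shift
        if after < before and product in peaks:
            product_before, product_after = peaks[product]
            if product_after > product_before:
                pairs.append((mz, product))
    return pairs

print(galactose_loss_pairs(PEAKS))
```

The search recovers the two pairs named in point 1 (1622 to 1460 and 1768 to 1606). The peaks in point 2 simply vanish and have no increasing partner within this subset of peaks.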
Example 6 

In another embodiment, the aminoterminal CTS region of an insect Mannosidase III gene 
(accession number: AF005034; mistakenly annotated as a Mannosidase II gene) is replaced by a 
mouse signal peptide coding sequence for import into the endoplasmic reticulum (see Figure 37). 
The signal peptide sequence encodes a fully active signal peptide normally present at the 
aminoterminus of IgG sequences and has been used successfully in plants and other organisms 
before. Furthermore a synthetic sequence coding for a so-called endoplasmic reticulum retention 
sequence (KDEL) is added to the carboxyterminus of the gene part encoding the catalytic fragment 
for ER retention. The hybrid Mannosidase III protein encoded by this gene sequence will hence 
accumulate preferentially in the endoplasmic reticulum. 

Example 7 

In another embodiment, the aminoterminal CTS region of the human beta-1,4- 
galactosyltransferase (GalT) gene (accession A52551) is replaced by a mouse signal peptide coding 
sequence for import into the endoplasmic reticulum (see Figure 39). The signal peptide sequence 
encodes a fully active signal peptide normally present at the aminoterminus of IgG sequences and 
has been used successfully in plants and other organisms before. Furthermore a synthetic sequence 
coding for a so-called endoplasmic reticulum retention sequence (KDEL) is added to the 
carboxyterminus of the gene part encoding the catalytic fragment for ER retention. The hybrid beta- 
1,4-galactosyl-transferase protein encoded by this gene sequence will hence accumulate 
preferentially in the endoplasmic reticulum. 

Example 8 

In another embodiment, the aminoterminal CTS region of Arabidopsis thaliana GnTI (acc. 
AJ243198) is replaced by a mouse signal peptide coding sequence for import into the endoplasmic 
reticulum (see Figure 41). The signal peptide sequence encodes a fully active signal peptide 
normally present at the aminoterminus of IgG sequences and has been used successfully in plants 
and other organisms before. Furthermore a synthetic sequence coding for a so-called endoplasmic 
reticulum retention sequence (KDEL) is added to the carboxyterminus of the gene part encoding the 
catalytic fragment for ER retention. The hybrid GnTI protein encoded by this gene sequence will 
hence accumulate preferentially in the endoplasmic reticulum. 

Example 9 

In another embodiment, the aminoterminal CTS region of an Arabidopsis thaliana GnTII 
(acc. AJ249274) is replaced by a mouse signal peptide coding sequence for import into the 
endoplasmic reticulum (see Figure 43). The signal peptide sequence encodes a fully active signal 
peptide normally present at the aminoterminus of IgG sequences and has been used successfully in
plants and other organisms before. Furthermore a synthetic sequence coding for a so-called
endoplasmic reticulum retention sequence (KDEL) is added to the carboxyterminus of the gene part 
encoding the catalytic fragment for ER retention. The hybrid GnTII protein encoded by this gene 
sequence will hence accumulate preferentially in the endoplasmic reticulum. 
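
The construct design shared by Examples 6 to 9 (aminoterminal CTS region replaced by a signal peptide, KDEL added at the carboxyterminus) can be sketched at the protein-sequence level as follows. This is a minimal sketch, not the cloning procedure itself: the function name, the `cts_length` parameter and the placeholder sequences are hypothetical and are not taken from the sequences of this application.

```python
ER_RETENTION_SIGNAL = "KDEL"  # carboxyterminal ER retention sequence named in the text


def build_er_retained_hybrid(signal_peptide, enzyme, cts_length):
    """Replace the aminoterminal CTS region by a signal peptide and append KDEL.

    cts_length is the (assumed known) number of residues in the CTS region.
    """
    if not 0 <= cts_length <= len(enzyme):
        raise ValueError("cts_length must lie within the enzyme sequence")
    catalytic_fragment = enzyme[cts_length:]
    return signal_peptide + catalytic_fragment + ER_RETENTION_SIGNAL


# Hypothetical placeholder strings, NOT real protein sequences.
signal = "MSIGNAL"
enzyme = "CTSCTSCATALYTIC"  # first 6 letters stand in for the CTS region
print(build_er_retained_hybrid(signal, enzyme, cts_length=6))
```

In practice the boundary between the CTS region and the catalytic fragment has to be chosen for each enzyme; the sketch simply takes it as an input.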


Example 10 

In another embodiment, the aminoterminal CTS region of the human
beta-1,4-galactosyltransferase (GalT) gene is replaced by the CTS region of the human gene for GnTI
(TmhuGnTI-GalT) (see Figure 45).
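
The domain swap of this example can be pictured as a simple splice of two protein sequences. The sketch below is illustrative only: `swap_cts`, both length parameters and the placeholder strings are hypothetical, and the real CTS boundaries of GnTI and GalT are not stated here.

```python
def swap_cts(donor, acceptor, donor_cts_len, acceptor_cts_len):
    """Fuse the CTS region of `donor` to the catalytic region of `acceptor`."""
    if donor_cts_len > len(donor) or acceptor_cts_len > len(acceptor):
        raise ValueError("CTS length exceeds sequence length")
    return donor[:donor_cts_len] + acceptor[acceptor_cts_len:]


# Hypothetical placeholders: 'A' marks donor CTS residues, 'D' acceptor catalytic residues.
donor = "AAAAAWWWW"      # stands in for GnTI (CTS = first 5 residues)
acceptor = "CCCCDDDDDD"  # stands in for GalT (CTS = first 4 residues)
print(swap_cts(donor, acceptor, donor_cts_len=5, acceptor_cts_len=4))
```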

It is understood that the present invention is not limited to any particular mechanism. Nor is 
it necessary to understand the mechanism in order to successfully use the various embodiments of 
the invention. Nonetheless, it is believed that there is a sequential distribution of Golgi enzymes 
(Figure 47) and that the swapping in of transmembrane domains of plant glycosyltransferases causes
relocalization (Figure 48).

It is understood that the present invention is not limited to the particular methodology, 
protocols, cell lines, vectors, and reagents described herein, as these may vary. It is also to be 
understood that the terminology used herein is used for the purpose of describing particular 
embodiments only, and is not intended to limit the scope of the present invention. It must be noted that
as used herein and in the appended claims, the singular forms "a", "an", and "the" include plural
reference unless the context clearly dictates otherwise. Unless defined otherwise, all technical and 
scientific terms used herein have the same meanings as commonly understood by one of ordinary 
skill in the art to which this invention belongs. 

The invention described and claimed herein is not to be limited in scope by the specific
embodiments herein disclosed, since these embodiments are intended as illustrations of several
aspects of the invention. Any equivalent embodiments are intended to be within the scope of this 
invention. Indeed, various modifications of the invention in addition to those shown and described 
herein will become apparent to those skilled in the art from the foregoing description. Such 
modifications are also intended to fall within the scope of the appended claims. 

Various references are cited herein, the disclosures of which are incorporated by reference in
their entireties.

WHAT IS CLAIMED IS:

1. Nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising a transmembrane
region of a plant glycosyltransferase and a catalytic region of a mammalian glycosyltransferase.

2. The nucleic acid of Claim 1, wherein said plant glycosyltransferase is a xylosyltransferase.

3. The nucleic acid of Claim 1, wherein said plant glycosyltransferase is a
N-acetylglucosaminyltransferase.

4. The nucleic acid of Claim 1, wherein said plant glycosyltransferase is a fucosyltransferase. 

5. The nucleic acid of Claim 1, wherein said mammalian glycosyltransferase is a human
galactosyltransferase.

6. The nucleic acid of Claim 5, wherein said human galactosyltransferase is encoded by at least
a portion of the nucleic acid sequence of SEQ ID NO:1.

7. An expression vector, comprising the nucleic acid of Claim 1.

8. A host cell transfected with the vector of Claim 7.

9. The host cell of Claim 8, wherein said host cell is a plant cell.

10. A cell suspension comprising the host cell of Claim 9. 

11. The hybrid enzyme expressed by the plant cell of Claim 9. 

12. The plant comprising the host cell of Claim 9. 

13. Nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising a transmembrane 
region of a first glycosyltransferase and a catalytic region of a second glycosyltransferase. 

14. The nucleic acid of Claim 13, wherein said first glycosyltransferase comprises a plant
glycosyltransferase.

15. The nucleic acid of Claim 14, wherein said plant glycosyltransferase is a xylosyltransferase. 

16. The nucleic acid of Claim 14, wherein said plant glycosyltransferase is a fucosyltransferase.

17. The nucleic acid of Claim 13, wherein said second glycosyltransferase comprises a
mammalian glycosyltransferase.

18. The nucleic acid of Claim 17, wherein said mammalian glycosyltransferase is a human
galactosyltransferase.

19. The nucleic acid of Claim 13, wherein said first glycosyltransferase comprises a first
mammalian glycosyltransferase and said second glycosyltransferase comprises a second mammalian
glycosyltransferase.

20. The nucleic acid of Claim 19, wherein said first mammalian glycosyltransferase is a non- 
human glycosyltransferase. 

21. The nucleic acid of Claim 19, wherein said second mammalian glycosyltransferase is a
human glycosyltransferase. 

22. A method, comprising: 

a. providing: i) a plant cell, and ii) an expression vector comprising nucleic acid
encoding a hybrid enzyme, said hybrid enzyme comprising a transmembrane region of a
plant glycosyltransferase and a catalytic region of a mammalian glycosyltransferase; and

b. introducing said expression vector into said plant cell under conditions such that 
said hybrid enzyme is expressed. 

23. The method of Claim 22, wherein said plant glycosyltransferase is a xylosyltransferase.

24. The method of Claim 23, wherein said plant glycosyltransferase is a
N-acetylglucosaminyltransferase.

25. The method of Claim 23, wherein said plant glycosyltransferase is a fucosyltransferase.

26. The method of Claim 22, wherein said mammalian glycosyltransferase is a human 
galactosyltransferase. 

27. The nucleic acid of Claim 26, wherein said human galactosyltransferase is encoded by at
least a portion of the nucleic acid sequence of SEQ ID NO:1.



28. A method, comprising: 
a. providing: i) a plant cell, ii) a first expression vector comprising nucleic acid encoding a
hybrid enzyme, said hybrid enzyme comprising a transmembrane region of a plant
glycosyltransferase and a catalytic region of a mammalian glycosyltransferase, and iii) a
second expression vector comprising nucleic acid encoding a heterologous glycoprotein;
and

b. introducing said first and second expression vectors into said plant cell under conditions 
such that said hybrid enzyme and said heterologous protein are expressed. 



29. The method of Claim 28, wherein said heterologous protein is an antibody or antibody
fragment.



30. A method, comprising: 

a) providing: i) a first plant comprising a first expression vector, said first vector
comprising nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising at
least a portion of a transmembrane region of a plant glycosyltransferase and at least a
portion of a catalytic region of a mammalian glycosyltransferase, and ii) a second plant
comprising a second expression vector, said second vector comprising nucleic acid
encoding a heterologous protein; and

b) crossing said first plant and said second plant to produce progeny expressing said hybrid
enzyme and said heterologous protein.

31. A plant, comprising first and second expression vectors, said first vector comprising nucleic
acid encoding a hybrid enzyme, said hybrid enzyme comprising at least a portion of a
transmembrane region of a plant glycosyltransferase and at least a portion of a catalytic region of a
mammalian glycosyltransferase, said second vector comprising nucleic acid encoding a heterologous
protein.

32. The plant of Claim 31, wherein said heterologous protein displays reduced amounts of
fucose as compared to when the heterologous protein is expressed in a plant in the absence of said
hybrid enzyme.

33. The plant of Claim 31, wherein the heterologous protein displays reduced amounts of xylose 
as compared to when the heterologous protein is expressed in a plant in the absence of said hybrid 
enzyme. 






34. The plant of Claim 31, wherein the heterologous protein displays both reduced fucose and 
xylose, as compared to when the heterologous protein is expressed in a plant in the absence of said 
hybrid enzyme. 
35. The plant of Claim 31, wherein the heterologous protein displays complex type bi-antennary
glycans and contains galactose residues on at least one of the arms.

36. Nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising a modified
mammalian glycosyltransferase, wherein a transmembrane portion has been deleted and an endoplasmic
reticulum retention signal has been inserted.

37. Nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising a CTS region or 
portion thereof of a plant glycosyltransferase and a catalytic region of a mammalian 
glycosyltransferase, wherein said CTS region is from a N-acetylglucosaminyltransferase I (GnTI) 
and said catalytic region is from a mannosidase II (ManII).

38. Nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising a CTS region or 
portion thereof of a plant glycosyltransferase and a catalytic region of a mammalian 
glycosyltransferase, wherein said CTS region or portion thereof is from a 
N-acetylglucosaminyltransferase I (GnTI) and said catalytic region is from a 
N-acetylglucosaminyltransferase II (GnTII). 

39. A plant host system, comprising (a) a nucleic acid sequence encoding a Mannosidase III 
glycosyltransferase; (b) a nucleic acid sequence encoding a hybrid enzyme, said hybrid enzyme 
comprising a CTS region or portion thereof of a plant glycosyltransferase and a catalytic domain of a 
mammalian glycosyltransferase.

40. A method, comprising (a) introducing into said plant host system a vector comprising (i) a 
nucleic acid sequence encoding a hybrid enzyme comprising a catalytic region of a 
galactosyltransferase not normally found in a plant and a transmembrane region of a protein, (ii) a 
nucleic acid sequence encoding a hybrid enzyme comprising a transmembrane region of a N-
acetylglucosaminyltransferase I (GnTI) and a catalytic region of a mannosidase II (ManII), (iii) a
nucleic acid sequence encoding a hybrid enzyme comprising a transmembrane region of an N- 
acetylglucosaminyltransferase I (GnTI) and a catalytic region of a N-acetylglucosaminyltransferase 
II (GnTII); and (b) isolating a plant or portion thereof expressing said nucleic acid sequences. 

41. A method, comprising: 

a. providing: i) a host cell, and ii) an expression vector comprising nucleic acid encoding a 
hybrid enzyme, said hybrid enzyme comprising a transmembrane region of a first 
glycosyltransferase and a catalytic region of a second glycosyltransferase; and 

b. introducing said expression vector into said host cell under conditions such that said 
hybrid enzyme is expressed. 
42. The method of Claim 41, wherein said first glycosyltransferase comprises a plant
glycosyltransferase. 

43. The method of Claim 42, wherein said plant glycosyltransferase is a xylosyltransferase. 

44. The method of Claim 42, wherein said plant glycosyltransferase is a
N-acetylglucosaminyltransferase.

45. The method of Claim 42, wherein said plant glycosyltransferase is a fucosyltransferase. 

46. The method of Claim 41, wherein said second glycosyltransferase comprises a mammalian 
glycosyltransferase. 

47. The method of Claim 46, wherein said mammalian glycosyltransferase is a human
galactosyltransferase.

48. A method, comprising: 

a. providing: i) a host cell, ii) a first expression vector comprising nucleic acid encoding a
hybrid enzyme, said hybrid enzyme comprising a transmembrane region of a first
glycosyltransferase and a catalytic region of a second glycosyltransferase, and iii) a
second expression vector comprising nucleic acid encoding a heterologous glycoprotein;
and

b. introducing said first and second expression vectors into said host cell under conditions 
such that said hybrid enzyme and said heterologous protein are expressed. 

49. The method of Claim 48, wherein said heterologous protein is an antibody or antibody
fragment.

50. The method of Claim 48, further comprising the step of c) isolating said heterologous
protein.

51. The isolated heterologous protein produced according to the method of Claim 50.

52. A host cell, comprising first and second expression vectors, said first vector comprising
nucleic acid encoding a hybrid enzyme, said hybrid enzyme comprising at least a portion of a
transmembrane region of a first glycosyltransferase and at least a portion of a catalytic region of a
second glycosyltransferase, said second vector comprising nucleic acid encoding a heterologous
protein.

53. The heterologous protein isolated from the host cell of Claim 52.
atgaggcttcgggagccgctcctgagcggcagcgccgcgatgccaggcgcgtccctacag

MRLREPLLSGSAAMPGASLQ
cgggcctgccgcctgctcgtggccgtctgcgctctgcaccttggcgtcaccctcgtttac 

RACRLLVAVCALHLGVTLVY 
tacctggctggccgcgacctgagccgcctgccccaactggtcggagtctccacaccgctg 

YLAGRDLSRLPQLVGVSTPL
cagggcggctcgaacagtgccgccgccatcgggcagtcctccggggagctccggaccgga

QGGSNSAAAIGQSSGELRTG
ggggcccggccgccgcctcctctaggcgcctcctcccagccgcgcccgggtggcgactcc 

GARPPPPLGASSQPRPGGDS 
agcccagtcgtggattctggccctggccccgctagcaacttgacctcggtcccagtgccc 

SPVVDSGPGPASNLTSVPVP 
cacaccaccgcactgtcgctgcccgcctgccctgaggagtccccgctgcttgtgggcccc 

HTTALSLPACPEESPLLVGP 
atgctgattgagtttaacatgcctgtggacctggagctcgtggcaaagcagaacccaaat 

MLIEFNMPVDLELVAKQNPN
gtgaagatgggcggccgctatgcccccagggactgcgtctctcctcacaaggtggccatc 

VKMGGRYAPRDCVSPHKVAI 
atcattccattccgcaaccggcaggagcacctcaagtactggctatattatttgcaccca 

IIPFRNRQEHLKYWLYYLHP
gtcctgcagcgccagcagctggactatggcatctatgttatcaaccaggcgggagacact 

VLQRQQLDYGIYVINQAGDT 
atattcaatcgtgctaagctcctcaatgttggctttcaagaagccttgaaggactatgac 

IFNRAKLLNVGFQEALKDYD
tacacctgctttgtgtttagtgacgtggacctcattccaatgaatgaccataatgcgtac

YTCFVFSDVDLIPMNDHNAY
aggtgtttttcacagccacggcacatttccgttgcaatggataagtttggattcagccta 

RCFSQPRHISVAMDKFGFSL 
ccttatgttcagtattttggaggtgtctctgctctaagtaaacaacagtttctaaccatc

PYVQYFGGVSALSKQQFLTI 
aatggatttcctaataattattggggctggggaggagaagatgatgacatttttaacaga 

NGFPNNYWGWGGEDDDIFNR 
ttagtttttagaggcatgtctatatctcgcccaaatgctgtggtcgggaggtgtcgcatg 

LVFRGMSISRPNAVVGRCRM
atccgccactcaagagacaagaaaaatgaacccaatcctcagaggtttgaccgaattgca 

IRHSRDKKNEPNPQRFDRIA 
cacacaaaggagacaatgctctctgatggtttgaactcactcacctaccaggtgctggat 

HTKETMLSDGLNSLTYQVLD
gtacagagatacccattgtatacccaaatcacagtggacatcgggacaccgagctag

VQRYPLYTQITVDIGTPS- 



FIG.6 
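
The pairing of nucleotide rows and one-letter amino acid rows in the figure can be reproduced with a short translation routine. This is a minimal sketch using the standard genetic code; the function name `translate` is an assumption introduced here, and the stop codon is printed as `*` whereas the figure marks it with `-`.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG codon order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First nucleotide row of the figure.
first_row = "atgaggcttcgggagccgctcctgagcggcagcgccgcgatgccaggcgcgtccctacag"
print(translate(first_row))  # MRLREPLLSGSAAMPGASLQ
```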
atgagtaaacggaatccgaagattctgaagatttttctgtatatgttacttctcaactct

MSKRNPKILKIFLYMLLLNS
ctctttctcatcatctacttcgtttttcactcatcgtcgttttcaccggagcagtcacag 

LFLIIYFVFHSSSFSPEQSQ
cctcctcatatataccacgtttcagtgaataaccaatcggcgatcgggcagtcctccggg

PPHIYHVSVNNQSAIGQSSG
gagctccggaccggaggggcccggccgccgcctcctctaggcgcctcctcccagccgcgc 

ELRTGGARPPPPLGASSQPR 
ccgggtggcgactccagcccagtcgtggattctggccctggccccgctagcaacttgacc 

PGGDSSPVVDSGPGPASNLT 
tcggtcccagtgccccacaccaccgcactgtcgctgcccgcctgccctgaggagtccccg 

SVPVPHTTALSLPACPEESP 
ctgcttgtgggccccatgctgattgagtttaacatgcctgtggacctggagctcgtggca

LLVGPMLIEFNMPVDLELVA
aagcagaacccaaatgtgaagatgggcggccgctatgcccccagggactgcgtctctcct

KQNPNVKMGGRYAPRDCVSP 
cacaaggtggccatcatcattccattccgcaaccggcaggagcacctcaagtactggcta 

HKVAIIIPFRNRQEHLKYWL
tattatttgcacccagtcctgcagcgccagcagctggactatggcatctatgttatcaac 

YYLHPVLQRQQLDYGIYVIN 
caggcgggagacactatattcaatcgtgctaagctcctcaatgttggctttcaagaagcc 

QAGDTIFNRAKLLNVGFQEA 
ttgaaggactatgactacacctgctttgtgtttagtgacgtggacctcattccaatgaat 

LKDYDYTCFVFSDVDLIPMN
gaccataatgcgtacaggtgtttttcacagccacggcacatttccgttgcaatggataag 

DHNAYRCFSQPRHISVAMDK 
tttggattcagcctaccttatgttcagtattttggaggtgtctctgctctaagtaaacaa 

FGFSLPYVQYFGGVSALSKQ 
SEQUENCE LISTING

<110> PLANT RESEARCH INTERNATIONAL BV
BAKKER, Hendrikus A.C.
FLORACK, Dionisius E.A.
BOSCH, Hendrik J.
ROUWENDAL, Gerard J.A.

<120> Optimizing glycan processing in plants

<130> beatlA - PD334fibtJ0

<150> US 60/365,735
<151> 2002-03-11 

<lbO> 5=1 

<170> Patentln version 3.2 



<210> 1
<211> 1197
<212> DNA
<213> Homo sapiens

<400> 1
atgaggcttc gggagccgct cctgagcggc agcgccgcga tgccaggcgc gtccctacag  60
cgggcctgcc gcctgctcgt ggccgtctgc gctctgcacc ttggcgtcac cctcgtttac  120
tacctggctg gccgcgacct gagccgcctg ccccaactgg tcggagtctc cacaccgctg  180
cagggcggct cgaacagtgc cgccgccatc gggcagtcct ccggggagct ccggaccgga  240
ggggcccggc cgccgcctcc tctaggcgcc tcctcccagc cgcgcccggg tggcgactcc  300
agcccagtcg tggattctgg ccctggcccc gctagcaact tgacctcggt cccagtgccc  360
cacaccaccg cactgtcgct gcccgcctgc cctgaggagt ccccgctgct tgtgggcccc  420
atgctgattg agtttaacat gcctgtggac ctggagctcg tggcaaagca gaacccaaat  480
gtgaagatgg gcggccgcta tgcccccagg gactgcgtct ctcctcacaa ggtggccatc  540
atcattccat tccgcaaccg gcaggagcac ctcaagtact ggctatatta tttgcaccca  600
gtcctgcagc gccagcagct ggactatggc atctatgtta tcaaccaggc gggagacact  660
atattcaatc gtgctaagct cctcaatgtt ggctttcaag aagccttgaa ggactatgac  720
tacacctgct ttgtgtttag tgacgtggac ctcattccaa tgaatgacca taatgcgtac  780
aggtgttttt cacagccacg gcacatttcc gttgcaatgg ataagtttgg attcagccta  840
ccttatgttc agtattttgg aggtgtctct gctctaagta aacaacagtt tctaaccatc  900
aatggatttc ctaataatta ttggggctgg ggaggagaag atgatgacat ttttaacaga  960
ttagttttta gaggcatgtc tatatctcgc ccaaatgctg tggtcgggag gtgtcgcatg  1020
atccgccact caagagacaa gaaaaatgaa cccaatcctc agaggtttga ccgaattgca  1080
cacacaaagg agacaatgct ctctgatggt ttgaactcac tcacctacca ggtgctggat  1140
gtacagagat acccattgta tacccaaatc acagtggaca tcgggacacc gagctag  1197



<210> 2
<211> 398
<212> PRT
<213> Homo sapiens

<400> 2

Met Arg Leu Arg Glu Pro Leu Leu Ser Gly Ser Ala Ala Met Pro Gly
1 5 10 15

Ala Ser Leu Gln Arg Ala Cys Arg Leu Leu Val Ala Val Cys Ala Leu
20 25 30

His Leu Gly Val Thr Leu Val Tyr Tyr Leu Ala Gly Arg Asp Leu Ser
35 40 45

Arg Leu Pro Gln Leu Val Gly Val Ser Thr Pro Leu Gln Gly Gly Ser
50 55 60

Asn Ser Ala Ala Ala Ile Gly Gln Ser Ser Gly Glu Leu Arg Thr Gly
65 70 75 80

Gly Ala Arg Pro Pro Pro Pro Leu Gly Ala Ser Ser Gln Pro Arg Pro
85 90 95

Gly Gly Asp Ser Ser Pro Val Val Asp Ser Gly Pro Gly Pro Ala Ser
100 105 110

Asn Leu Thr Ser Val Pro Val Pro His Thr Thr Ala Leu Ser Leu Pro
115 120 125

Ala Cys Pro Glu Glu Ser Pro Leu Leu Val Gly Pro Met Leu Ile Glu
130 135 140

Phe Asn Met Pro Val Asp Leu Glu Leu Val Ala Lys Gln Asn Pro Asn
145 150 155 160

Val Lys Met Gly Gly Arg Tyr Ala Pro Arg Asp Cys Val Ser Pro His
165 170 175

Lys Val Ala Ile Ile Ile Pro Phe Arg Asn Arg Gln Glu His Leu Lys
180 185 190

Tyr Trp Leu Tyr Tyr Leu His Pro Val Leu Gln Arg Gln Gln Leu Asp
195 200 205

Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly Asp Thr Ile Phe Asn Arg
210 215 220

Ala Lys Leu Leu Asn Val Gly Phe Gln Glu Ala Leu Lys Asp Tyr Asp
225 230 235 240

Tyr Thr Cys Phe Val Phe Ser Asp Val Asp Leu Ile Pro Met Asn Asp
245 250 255

His Asn Ala Tyr Arg Cys Phe Ser Gln Pro Arg His Ile Ser Val Ala
260 265 270

Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr Val Gln Tyr Phe Gly Gly
275 280 285

Val Ser Ala Leu Ser Lys Gln Gln Phe Leu Thr Ile Asn Gly Phe Pro
290 295 300

Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp Asp Asp Ile Phe Asn Arg
305 310 315 320

Leu Val Phe Arg Gly Met Ser Ile Ser Arg Pro Asn Ala Val Val Gly
325 330 335

Arg Cys Arg Met Ile Arg His Ser Arg Asp Lys Lys Asn Glu Pro Asn
340 345 350

Pro Gln Arg Phe Asp Arg Ile Ala His Thr Lys Glu Thr Met Leu Ser
355 360 365

Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val Leu Asp Val Gln Arg Tyr
370 375 380

Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile Gly Thr Pro Ser
385 390 395

<210> 3
<211> 1152
<212> DNA
<213> hybrid

<400> 3
atgagtaaac ggaatccgaa gattctgaag atttttctgt atatgttact tctcaactct  60
ctctttctca tcatctactt cgtttttcac tcatcgtcgt tttcaccgga gcagtcacag  120
cctcctcata tataccacgt ttcagtgaat aaccaatcgg cgatcgggca gtcctccggg  180
gagctccgga ccggaggggc ccggccgccg cctcctctag gcgcctcctc ccagccgcgc  240
ccgggtggcg actccagccc agtcgtggat tctggccctg gccccgctag caacttgacc  300
tcggtcccag tgccccacac caccgcactg tcgctgcccg cctgccctga ggagtccccg  360
ctgcttgtgg gccccatgct gattgagttt aacatgcctg tggacctgga gctcgtggca  420
aagcagaacc caaatgtgaa gatgggcggc cgctatgccc ccagggactg cgtctctcct  480
cacaaggtgg ccatcatcat tccattccgc aaccggcagg agcacctcaa gtactggcta  540
tattatttgc acccagtcct gcagcgccag cagctggact atggcatcta tgttatcaac  600
caggcgggag acactatatt caatcgtgct aagctcctca atgttggctt tcaagaagcc  660
ttgaaggact atgactacac ctgctttgtg tttagtgacg tggacctcat tccaatgaat  720
gaccataatg cgtacaggtg tttttcacag ccacggcaca tttccgttgc aatggataag  780
tttggattca gcctacctta tgttcagtat tttggaggtg tctctgctct aagtaaacaa  840
cagtttctaa ccatcaatgg atttcctaat aattattggg gctggggagg agaagatgat  900
gacattttta acagattagt ttttagaggc atgtctatat ctcgcccaaa tgctgtggtc  960
gggaggtgtc gcatgatccg ccactcaaga gacaagaaaa atgaacccaa tcctcagagg  1020
tttgaccgaa ttgcacacac aaaggagaca atgctctctg atggtttgaa ctcactcacc  1080
taccaggtgc tggatgtaca gagataccca ttgtataccc aaatcacagt ggacatcggg  1140
acaccgagct ag  1152



<210> 4
<211> 383
<212> PRT
<213> hybrid

<400> 4

Met Ser Lys Arg Asn Pro Lys Ile Leu Lys Ile Phe Leu Tyr Met Leu
1 5 10 15

Leu Leu Asn Ser Leu Phe Leu Ile Ile Tyr Phe Val Phe His Ser Ser
20 25 30

Ser Phe Ser Pro Glu Gln Ser Gln Pro Pro His Ile Tyr His Val Ser
35 40 45

Val Asn Asn Gln Ser Ala Ile Gly Gln Ser Ser Gly Glu Leu Arg Thr
50 55 60

Gly Gly Ala Arg Pro Pro Pro Pro Leu Gly Ala Ser Ser Gln Pro Arg
65 70 75 80

Pro Gly Gly Asp Ser Ser Pro Val Val Asp Ser Gly Pro Gly Pro Ala
85 90 95

Ser Asn Leu Thr Ser Val Pro Val Pro His Thr Thr Ala Leu Ser Leu
100 105 110

Pro Ala Cys Pro Glu Glu Ser Pro Leu Leu Val Gly Pro Met Leu Ile
115 120 125

Glu Phe Asn Met Pro Val Asp Leu Glu Leu Val Ala Lys Gln Asn Pro
130 135 140

Asn Val Lys Met Gly Gly Arg Tyr Ala Pro Arg Asp Cys Val Ser Pro
145 150 155 160

His Lys Val Ala Ile Ile Ile Pro Phe Arg Asn Arg Gln Glu His Leu
165 170 175

Lys Tyr Trp Leu Tyr Tyr Leu His Pro Val Leu Gln Arg Gln Gln Leu
180 185 190

Asp Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly Asp Thr Ile Phe Asn
195 200 205

Arg Ala Lys Leu Leu Asn Val Gly Phe Gln Glu Ala Leu Lys Asp Tyr
210 215 220

Asp Tyr Thr Cys Phe Val Phe Ser Asp Val Asp Leu Ile Pro Met Asn
225 230 235 240

Asp His Asn Ala Tyr Arg Cys Phe Ser Gln Pro Arg His Ile Ser Val
245 250 255

Ala Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr Val Gln Tyr Phe Gly
260 265 270

Gly Val Ser Ala Leu Ser Lys Gln Gln Phe Leu Thr Ile Asn Gly Phe
275 280 285

Pro Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp Asp Asp Ile Phe Asn
290 295 300

Arg Leu Val Phe Arg Gly Met Ser Ile Ser Arg Pro Asn Ala Val Val
305 310 315 320

Gly Arg Cys Arg Met Ile Arg His Ser Arg Asp Lys Lys Asn Glu Pro
325 330 335

Asn Pro Gln Arg Phe Asp Arg Ile Ala His Thr Lys Glu Thr Met Leu
340 345 350

Ser Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val Leu Asp Val Gln Arg
355 360 365

Tyr Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile Gly Thr Pro Ser
370 375 380



<210> 5
<211> 1642
<212> DNA
<213> Homo sapiens

<400> 5
ccatggtgat gagacgctac aagctctttc tcatgttctg tatggccggc ctgtgcctca  60
tctccttcct gcacttcttc aagaccctgt cctatgtcac cttcccccga gaactggcct  120
ccctcagccc taacctggtg tccagctttt tctggaacaa tgccccggtc acgccccagg  180
ccagccccga gccaggaggc cctgacctgc tgcgtacccc actctactcc cactcgcccc  240
tgctgcagcc gctgccgccc agcaaggcgg ccgaggagct ccaccgggtg gacttggtgc  300
tgcccgagga caccaccgag tatttcgtgc gcaccaaggc cggcggcgtc tgcttcaaac  360
ccggcaccaa gatgctggag aggccgcccc cgggacggcc ggaggagaag cctgaggggg  420
ccaacggctc ctcggcccgg cggccacccc ggtacctcct gagcgcccgg gagcgcacgg  480
ggggccgagg cgcccggcgc aagtgggtgg agtgcgtgtg cctgcccggc tggcacggac  540
ccagctgcgg cgtgcccact gtggtgcagt actccaacct gcccaccaag gagcggctgg  600
tgcccaggga ggtgccgcgc cgcgtcatca acgccatcaa cgtcaaccac gagttcgacc  660
tgctggacgt gcgcttccac gagctgggcg acgtggtgga cgcctttgtg gtgtgcgagt  720
ccaacttcac ggcttatggg gagccgcggc cgctcaagtt ccgggagatg ctgaccaatg  780
gcaccttcga gtacatccgc cacaaggtgc tctatgtctt cctggaccac ttcccgcccg  840
gcggccggca ggacggctgg atcgccgacg actacctgcg caccttcctc acccaggacg  900
gcgtctcgcg gctgcgcaac ctgcggcccg acgacgtctt catcattgac gatgcggacg  960
agatcccggc ccgtgacggc gtccttttcc tcaagctcta cgatggctgg accgagccct  1020
tcgccttcca catgcgcaag tcgctctacg gcttcttctg gaagcagccg ggcaccctgg  1080
aggtggtgtc aggctgcacg gtggacatgc tgcaggcagt gtatgggctg gacggcatcc  1140
gcctgcgccg ccgccagtac tacaccatgc ccaacttcag acagtatgag aaccgcaccg  1200
gccacatcct ggtgcagtgg tcgctgggca gccccctgca cttcgccggc tggcactgct  1260
cctggtgctt cacgcccgag ggcatctact tcaagctcgt gtccgcccag aatggcgact  1320
tcccacgctg gggtgactac gaggacaagc gggacctgaa ctacatccgc ggcctgatcc  1380
gcaccggggg ctggttcgac ggcacgcagc aggagtaccc gcctgcagac cccagcgagc  1440
acatgtatgc gcccaagtac ctgctgaaga actacgaccg gttccactac ctgctggaca  1500
acccctacca ggagcccagg agcacggcgg cgggcgggtg gcgccacagg ggtcccgagg  1560
gaaggccgcc cgcccggggc aaactggacg aggcggaagt cgaacaaaaa ctcatctcag  1620
aagaggatct gaattaggat cc  1642

<210> 6
<211> 544
<212> PRT
<213> Homo sapiens

<400> 6

Met Val Met Arg Arg Tyr Lys Leu Phe Leu Met Phe Cys Met Ala Gly
1 5 10 15

Leu Cys Leu Ile Ser Phe Leu His Phe Phe Lys Thr Leu Ser Tyr Val
20 25 30

Thr Phe Pro Arg Glu Leu Ala Ser Leu Ser Pro Asn Leu Val Ser Ser
35 40 45

Phe Phe Trp Asn Asn Ala Pro Val Thr Pro Gln Ala Ser Pro Glu Pro
50 55 60

Gly Gly Pro Asp Leu Leu Arg Thr Pro Leu Tyr Ser His Ser Pro Leu
65 70 75 80

Leu Gln Pro Leu Pro Pro Ser Lys Ala Ala Glu Glu Leu His Arg Val
85 90 95

Asp Leu Val Leu Pro Glu Asp Thr Thr Glu Tyr Phe Val Arg Thr Lys
100 105 110

Ala Gly Gly Val Cys Phe Lys Pro Gly Thr Lys Met Leu Glu Arg Pro
115 120 125

Pro Pro Gly Arg Pro Glu Glu Lys Pro Glu Gly Ala Asn Gly Ser Ser
130 135 140

Ala Arg Arg Pro Pro Arg Tyr Leu Leu Ser Ala Arg Glu Arg Thr Gly
145 150 155 160

Gly Arg Gly Ala Arg Arg Lys Trp Val Glu Cys Val Cys Leu Pro Gly
165 170 175

Trp His Gly Pro Ser Cys Gly Val Pro Thr Val Val Gln Tyr Ser Asn
180 185 190

Leu Pro Thr Lys Glu Arg Leu Val Pro Arg Glu Val Pro Arg Arg Val
195 200 205

Ile Asn Ala Ile Asn Val Asn His Glu Phe Asp Leu Leu Asp Val Arg
210 215 220

Phe His Glu Leu Gly Asp Val Val Asp Ala Phe Val Val Cys Glu Ser
225 230 235 240

Asn Phe Thr Ala Tyr Gly Glu Pro Arg Pro Leu Lys Phe Arg Glu Met
245 250 255

Leu Thr Asn Gly Thr Phe Glu Tyr Ile Arg His Lys Val Leu Tyr Val
260 265 270

Phe Leu Asp His Phe Pro Pro Gly Gly Arg Gln Asp Gly Trp Ile Ala
275 280 285

Asp Asp Tyr Leu Arg Thr Phe Leu Thr Gln Asp Gly Val Ser Arg Leu
290 295 300

Arg Asn Leu Arg Pro Asp Asp Val Phe Ile Ile Asp Asp Ala Asp Glu
305 310 315 320

Ile Pro Ala Arg Asp Gly Val Leu Phe Leu Lys Leu Tyr Asp Gly Trp
325 330 335

Thr Glu Pro Phe Ala Phe His Met Arg Lys Ser Leu Tyr Gly Phe Phe
340 345 350

Trp Lys Gln Pro Gly Thr Leu Glu Val Val Ser Gly Cys Thr Val Asp
355 360 365

Met Leu Gln Ala Val Tyr Gly Leu Asp Gly Ile Arg Leu Arg Arg Arg
370 375 380

Gln Tyr Tyr Thr Met Pro Asn Phe Arg Gln Tyr Glu Asn Arg Thr Gly
385 390 395 400

His Ile Leu Val Gln Trp Ser Leu Gly Ser Pro Leu His Phe Ala Gly
405 410 415

Trp His Cys Ser Trp Cys Phe Thr Pro Glu Gly Ile Tyr Phe Lys Leu
420 425 430

Val Ser Ala Gln Asn Gly Asp Phe Pro Arg Trp Gly Asp Tyr Glu Asp
435 440 445

Lys Arg Asp Leu Asn Tyr Ile Arg Gly Leu Ile Arg Thr Gly Gly Trp
450 455 460

Phe Asp Gly Thr Gln Gln Glu Tyr Pro Pro Ala Asp Pro Ser Glu His
465 470 475 480

Met Tyr Ala Pro Lys Tyr Leu Leu Lys Asn Tyr Asp Arg Phe His Tyr
485 490 495

Leu Leu Asp Asn Pro Tyr Gln Glu Pro Arg Ser Thr Ala Ala Gly Gly
500 505 510

Trp Arg His Arg Gly Pro Glu Gly Arg Pro Pro Ala Arg Gly Lys Leu
515 520 525

Asp Glu Ala Glu Val Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn
530 535 540



<210> 7 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Synthetic 
<400> 7 

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1 5 10

<210> 8
<211> 31
<212> PRT
<213> Homo sapiens

<400> 8

Gln Glu Pro Arg Ser Thr Ala Ala Gly Gly Trp Arg His Arg Gly Pro
1 5 10 15

Glu Gly Arg Pro Pro Ala Arg Gly Lys Leu Asp Glu Ala Glu Val
20 25 30



<210> 9
<211> 1614
<212> DNA
<213> hybrid

<400> 9
catgagtaaa cggaatccga agattctgaa gatttttctg tatatgttac ttctcaactc  60
tctctttctc atcatctact tcgtttttca ctcatcgtcg ttttcaccgg agcagtcaca  120
gcctcctcat atataccacg tttcagtgaa taaccaatcg gcacatggag gccctgacct  180
gctgcgtacc ccactctact cccactcgcc cctgctgcag ccgctgccgc ccagcaaggc  240
ggccgaggag ctccaccggg tggacttggt gctgcccgag gacaccaccg agtatttcgt  300
gcgcaccaag gccggcggcg tctgcttcaa acccggcacc aagatgctgg agaggccgcc  360
cccgggacgg ccggaggaga agcctgaggg ggccaacggc tcctcggccc ggcggccacc  420
ccggtacctc ctgagcgccc gggagcgcac ggggggccga ggcgcccggc gcaagtgggt  480
ggagtgcgtg tgcctgcccg gctggcacgg acccagctgc ggcgtgccca ctgtggtgca  540
gtactccaac ctgcccacca aggagcggct ggtgcccagg gaggtgccgc gccgcgtcat  600
caacgccatc aacgtcaacc acgagttcga cctgctggac gtgcgcttcc acgagctggg  660
cgacgtggtg gacgcctttg tggtgtgcga gtccaacttc acggcttatg gggagccgcg  720
gccgctcaag ttccgggaga tgctgaccaa tggcaccttc gagtacatcc gccacaaggt  780
gctctatgtc ttcctggacc acttcccgcc cggcggccgg caggacggct ggatcgccga  840
cgactacctg cgcaccttcc tcacccagga cggcgtctcg cggctgcgca acctgcggcc  900
cgacgacgtc ttcatcattg acgatgcgga cgagatcccg gcccgtgacg gcgtcctttt  960
cctcaagctc tacgatggct ggaccgagcc cttcgccttc cacatgcgca agtcgctcta  1020
cggcttcttc tggaagcagc cgggcaccct ggaggtggtg tcaggctgca cggtggacat  1080
gctgcaggca gtgtatgggc tggacggcat ccgcctgcgc cgccgccagt actacaccat  1140
gcccaacttc agacagtatg agaaccgcac cggccacatc ctggtgcagt ggtcgctggg  1200
cagccccctg cacttcgccg gctggcactg ctcctggtgc ttcacgcccg agggcatcta  1260
cttcaagctc gtgtccgccc agaatggcga cttcccacgc tggggtgact acgaggacaa  1320
gcgggacctg aactacatcc gcggcctgat ccgcaccggg ggctggttcg acggcacgca  1380
gcaggagtac ccgcctgcag accccagcga gcacatgtat gcgcccaagt acctgctgaa  1440
gaactacgac cggttccact acctgctgga caacccctac caggagccca ggagcacggc  1500
ggcgggcggg tggcgccaca ggggtcccga gggaaggccg cccgcccggg gcaaactgga  1560
cgaggcggaa gtcgaacaaa aactcatctc agaagaggat ctgaattagg atcc  1614



<210> 10 

<211> 535 

<212> PRT 

<213> hybrid 

<400> 10 

Met Ser Lys Arg Asn Pro Lys Ile Leu Lys Ile Phe Leu Tyr Met Leu 
1 5 10 15 

Leu Leu Asn Ser Leu Phe Leu Ile Ile Tyr Phe Val Phe His Ser Ser 
20 25 30 

Ser Phe Ser Pro Glu Gln Ser Gln Pro Pro His Ile Tyr His Val Ser 
35 40 45 

Val Asn Asn Gln Ser Ala His Gly Gly Pro Asp Leu Leu Arg Thr Pro 
50 55 60 

Leu Tyr Ser His Ser Pro Leu Leu Gln Pro Leu Pro Pro Ser Lys Ala 
65 70 75 80 

Ala Glu Glu Leu His Arg Val Asp Leu Val Leu Pro Glu Asp Thr Thr 
85 90 95 

Glu Tyr Phe Val Arg Thr Lys Ala Gly Gly Val Cys Phe Lys Pro Gly 
100 105 110 

Thr Lys Met Leu Glu Arg Pro Pro Pro Gly Arg Pro Glu Glu Lys Pro 
115 120 125 

Glu Gly Ala Asn Gly Ser Ser Ala Arg Arg Pro Pro Arg Tyr Leu Leu 
130 135 140 

Ser Ala Arg Glu Arg Thr Gly Gly Arg Gly Ala Arg Arg Lys Trp Val 
145 150 155 160 

Glu Cys Val Cys Leu Pro Gly Trp His Gly Pro Ser Cys Gly Val Pro 
165 170 175 

Thr Val Val Gln Tyr Ser Asn Leu Pro Thr Lys Glu Arg Leu Val Pro 
180 185 190 

Arg Glu Val Pro Arg Arg Val Ile Asn Ala Ile Asn Val Asn His Glu 
195 200 205 

Phe Asp Leu Leu Asp Val Arg Phe His Glu Leu Gly Asp Val Val Asp 
210 215 220 

Ala Phe Val Val Cys Glu Ser Asn Phe Thr Ala Tyr Gly Glu Pro Arg 
225 230 235 240 

Pro Leu Lys Phe Arg Glu Met Leu Thr Asn Gly Thr Phe Glu Tyr Ile 
245 250 255 

Arg His Lys Val Leu Tyr Val Phe Leu Asp His Phe Pro Pro Gly Gly 
260 265 270 

Arg Gln Asp Gly Trp Ile Ala Asp Asp Tyr Leu Arg Thr Phe Leu Thr 
275 280 285 

Gln Asp Gly Val Ser Arg Leu Arg Asn Leu Arg Pro Asp Asp Val Phe 
290 295 300 

Ile Ile Asp Asp Ala Asp Glu Ile Pro Ala Arg Asp Gly Val Leu Phe 
305 310 315 320 

Leu Lys Leu Tyr Asp Gly Trp Thr Glu Pro Phe Ala Phe His Met Arg 
325 330 335 

Lys Ser Leu Tyr Gly Phe Phe Trp Lys Gln Pro Gly Thr Leu Glu Val 
340 345 350 

Val Ser Gly Cys Thr Val Asp Met Leu Gln Ala Val Tyr Gly Leu Asp 
355 360 365 

Gly Ile Arg Leu Arg Arg Arg Gln Tyr Tyr Thr Met Pro Asn Phe Arg 
370 375 380 

Gln Tyr Glu Asn Arg Thr Gly His Ile Leu Val Gln Trp Ser Leu Gly 
385 390 395 400 

Ser Pro Leu His Phe Ala Gly Trp His Cys Ser Trp Cys Phe Thr Pro 
405 410 415 

Glu Gly Ile Tyr Phe Lys Leu Val Ser Ala Gln Asn Gly Asp Phe Pro 
420 425 430 

Arg Trp Gly Asp Tyr Glu Asp Lys Arg Asp Leu Asn Tyr Ile Arg Gly 
435 440 445 

Leu Ile Arg Thr Gly Gly Trp Phe Asp Gly Thr Gln Gln Glu Tyr Pro 
450 455 460 

Pro Ala Asp Pro Ser Glu His Met Tyr Ala Pro Lys Tyr Leu Leu Lys 
465 470 475 480 

Asn Tyr Asp Arg Phe His Tyr Leu Leu Asp Asn Pro Tyr Gln Glu Pro 
485 490 495 

Arg Ser Thr Ala Ala Gly Gly Trp Arg His Arg Gly Pro Glu Gly Arg 
500 505 510 

Pro Pro Ala Arg Gly Lys Leu Asp Glu Ala Glu Val Glu Gln Lys Leu 
515 520 525 

Ile Ser Glu Glu Asp Leu Asn 
530 535 



<210> 11 

<211> 13 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 11 
aatacttcca ccc 



<210> 12 

<211> 34 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 12 

ccacccgtta acaatgaaga tgagacgcta caag 



34 



<210> 13 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 13 

gggccatgga gatgagacgc tacaagctc 



<210> 14 

<211> 28 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 14 

ggatccaatg aagatgagac gctacaag 



<210> 15 

<211> 40 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Synthetic 
<400> 15 

gggcccggga gatcctaatt cagatcctct tctgagatga 



<210> 16 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 16 

cccggatcct aattcagatc ctcttctgag atgag 



<210> 17 

<211> 38 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 17 

gggtctagat cctaattcag atcctcttct gagatgag 



<210> 18 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 18 

ccacccgtta acaatgagta aacggaatcc gaaga 



<210> 19 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 19 

gggccatggg taaacggaat ccgaagattc tgaag 



<210> 20 

<211> 34 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 20 

cccggatcca tgagtaaacg gaatccgaag attc 



<210> 21 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 
<400> 21 

gcgccccggg acgctagctc ggtgtcccg 



<210> 22 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 22 

cccggatcca cgctagctcg gtgtc 



<210> 23 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 23 

gggtctagat ccacgctagc tcggtgtccc g 



<210> 24 

<211> 39 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 24 

ccacccgtta acaatgaggc ttcgggagcc gctcctgag 



<210> 25 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 25 

gggccatggg gcttcgggag ccgctcctga g 



<210> 26 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic 

<400> 26 

cccggatcca tgaggcttcg ggagccgctc ctgag 
<210> 27 

<211> 7155 

<212> DNA 

<213> hybrid 



<400> 27 
ggcgcgcctc 
ttgatttttg 
acaaagatta 
atctttagga 
aactcgttat 
gaatatgttc 
gaccgtattg 
atgcagtaac 
tatttccatt 
ccatgtacga 
ggggtaccac 
tctccttacc 
gtggttaatg 
tgagccacag 
gttagatagc 
attcatagaa 
tgagatttct 
agacgcaatc 
ctagtcaaat 
tcgaagatat 
cgtttgaaaa 
gcgccgtcgt 
atggtggtcc 
agaagctcaa 
tagaggagta 
ctaaggtatg 
tcccgtgaac 
acttctttgc 
gttcttttgg 
gagattgaag 
ggattcaaga 
cgcttcacct 
tgttggaggt 
acagatagca 
ttgggctata 
ttttgaaaac 
taagaatctt 
tgttcatatg 
aatttgctgt 
gggaaagcac 
ggatcaatac 
agatgatttt 
gttgtttgat 
ggaggattat 
tgaggttggc 
tgcagatagg 
tgttgatcgt 
aggttattgc 
tgctgcaaga 
ggattatgtg 
ctttatgtct 
tcaatcccca 
tcacaagcca 
agaacagacg 
ggactcaaac 
caaactattc 
gagaacatat 
caaatacgct 
ggacaacgac 
cggatcactg 



gaggcgatcg 
tttcagtggt 
tttgttgatg 
gataccagcc 
cttttcattc 
ggccgatatg 
taccctcttt 
atataggtat 
tctgttatat 
aataacttct 
atataggaag 
acgaagagat 
ataagggatt 
gatccaatgg 
aaacaacatt 
gtccattcct 
tctcatcccg 
acagtatgca 
gcgaggcctc 
gaagaaccgc 
aaaaggaata 
tgatatcaca 
atggaaacaa 
aatcttcgtt 
ttatcagaga 
acgaaagttt 
aatcttaaat 
tgggttctgt 
aaaatttgaa 
gatagctaga 
agaaagttta 
aataaacaag 
ggctgggtta 
gagggtaata 
gatccctttg 
atgcttattc 
gaatatattt 
atgccgtttt 
cagtttgatt 
ccagtggaga 
aggaaaaaat 
aggtacatta 
cacatcaact 
ttcagaacag 
tctggtcagg 
caacaagact 
gtgctcgagc 
catcgaattc 
agaaatctgg 
gtacaagatt 
aaagcaatcg 
tcatttttcg 
attgctgccc 
agagaggagg 
tggacttgtg 
accggcagac 
ttcattgcta 
tctgagtttg 
gttactgaga 
cggaagatag 



cagatctaat 
tacatatatc 
ttcttgatgg 
aggattatat 
aaaggatgag 
cctttgttgg 
ccataaagga 
tcaaaaatgg 
aaatttcaca 
attatttggt 
gtaacaaaat 
aagatataag 
acatccttct 
ccacaggaac 
ataaaaggtg 
cctaagtatc 
gcagctttca 
gatcgcctca 
atagatgaag 
caggacgaag 
gcaaaactca 
actaaagatc 
ggttggagag 
gttcctcatt 
caatccagac 
ttgcttttgg 
gtcttaaaat 
ttttttttag 
gtctttggag 
atcttatttg 
tatgggagga 
aagctttgac 
tgaatgatga 
tgtggctgaa 
gctattcatc 
aaaggactca 
ggcgtcagag 
attcatacga 
tcgctcggat 
ccacactaga 
ccactctata 
gtatcgatga 
ctaatcctag 
tccgagaaga 
ttgttggttt 
attggagtgg 
atacccttcg 
aatgtgagaa 
ctcttttcca 
acggcacccg 
aagttcttct 
aggcagagca 
gggaaggaaa 
tggtgacggt 
tccctagcca 
atcgccttta 
atgggaatgt 
acccatttcc 
tccgaaatga 
tccatagaaa 



ctaaccaatt 
ttgttttata 
ggctcagaag 
tcagtaagac 
ccagaatctt 
cttcaatatt 
aaacacaata 
ctaaaagaag 
acacacaaaa 
attgggccta 
actgcaagat 
acccaccctg 
atgtttgtgg 
gtaagaatgt 
tgtatcaata 
tagaaaccat 
tgttcatcta 
gttccgctat 
ttagcatcaa 
aacttgtgca 
ctcaaggtgg 
tatacgatag 
ttacgtataa 
ctcataacga 
atattcttga 
ttttaatatt 
tctcatgacg 
tttcgtgatg 
ctaaagtttg 
tgtgggggtt 
gatgtcatat 
taaattggtt 
ggctaattca 
tgacacaatt 
aaccatggct 
ttacgagctc 
ctgggatgct 
tatcccacac 
gcggggattt 
aaatgtgcag 
tcgaactaat 
agccgaggct 
tctaaacgca 
agcagacaga 
cccttctctg 
ttattatgtt 
tggagctgag 
atttccaaca 
gcaccatgat 
gatgcatact 
tgggatccgc 
aatgagatca 
ttcgcacaca 
tgttgttaac 
aatttctcct 
ctggaaagct 
cgagtgtgag 
ttgtcctcct 
acatcagact 
cggatcagag 



acgatacgct 
tgctatcttt 
atttgatatg 
aatcaaattt 
tatagaatga 
ctacatatca 
tgcagatgct 
ttggataaca 
gcccgtaatc 
agcccagctc 
agccccataa 
ccacgtgtca 
acatgatgca 
agatagattt 
ggaactaatt 
ggcgaggatc 
catccagatg 
cgaatctgag 
acagtcgcgg 
gcttaaggat 
agccatggat 
gattgagttt 
agacgatgag 
tcctggttgg 
caccattgtt 
ttaattctct 
tcattaaact 
aaacagagtt 
tttttttatt 
tgttttgaat 
ctggagagat 
aaggatgggc 
cattattttg 
ggggttattc 
tatcttctcc 
aagaaagacc 
atggaaacca 
acttgtggac 
aagtatgaac 
gagagggcat 
acacttctta 
cagttccgta 
gaagcaaagt 
gtgaattatt 
tcaggtgact 
tcaagacctt 
atcatgatgt 
agttttacgt 
ggggtaactg 
tcattgcaag 
cacgagaaag 
aagtatgatg 
gttatactct 
cgcgctgaaa 
gaagtgcagc 
tccatcccag 
aaagctactc 
ccatattcct 
cttgtgtttg 
actgttgtgg 



ttgggtacac 
aaggatctgc 
atacactcta 
tacgtgttca 
ttgcaatcga 
cacaagaatc 
tttttcccac 
aattgacaac 
aagagtctgc 
agagtacgtg 
cgtaccagcc 
catcgtcatg 
tgtaatgtca 
gattttgtcc 
cactcattgg 
tcgtgtgact 
aggcttttcc 
aaccattgca 
attgttgccc 
ctaatccaga 
tccaattcag 
cttgatacag 
tgggagaaag 
aaattgactg 
gagactttat 
cccatggtta 
ctataaccaa 
ctagaagttc 
actgggtttt 
atgtttaata 
ggtggagaga 
agctagagat 
ccataattga 
ctaagaattc 
ggcgtatggg 
ttgcccagca 
cagatatctt 
cagagcctgc 
tttgtccatg 
taaagcttct 
tacctcttgg 
actaccagat 
ttggtacttt 
ctcgtcctgg 
tctttacata 
tcttcaaagc 
catttctgct 
ataagttgac 
gaactgctaa 
accttcagat 
aaaaatctga 
ctcggccagt 
tcaatccatc 
tctcggtttt 
atgacgatac 
ctcttggtct 
cgtctaaact 
gctccaaact 
atgtgaagaa 
gagaagagat 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
aggtatgtac 

gccaattgtt 

cttctcttac 

cactggaggt 

tggtaatgat 

gaaggtcttc 

gatccctctt 

tggtcagaga 

ttggttggag 

aggtgtgatg 

ttctcaagca 

cataggtgct 

atctgtgcgt 

cctccacatt 

agacaagcca 

taaaggaaga 

gttcaaagat 

tatggagatt 

aggacgtgtc 

caagtgaacc 

tctatctctc 

gggttcttat 

tatttgtaaa 

cctcgaggcg 

tcgttaattc 

aaatacagaa 

ccgccttgtg 

ctatgagaag 

ggaatataat 

aagatcacaa 

ttcacttagc 

tcttgtggag 

gagaagaaaa 

ttcatgttca 

ctcagttccg 

gaagttagca 

gaagaacttg 

ctcactcaag 

aatacggatc 

aatcgggctc 

gagacattgt 

agtattaagt 

actagcttcc 

cattgtgaag 

aagcatcact 

catgaggggc 

aacatacaga 

ttagcaccgt 

atgggaaatg 

agagagtttt 

ccgtcgtttg 

ggaaaatgtg 

gtaaacatag 

cgggtgtata 

gatgataggg 

agtgcatctc 

aatctatctc 

tagggttctt 

tgtatttgta 

agacctctta 



tctagtccag 

caacctgatg 

cctaaaacca 

aatacgcttc 

tttgatgacc 

tattcagatc 

caaggaaact 

ttctccgtgc 

attatgctgg 

gataaccgcg 

gaccctgctt 

cacttaaact 

gttccacaat 

gtaaatttca 

aggttcgctc 

caagtaaact 

cttgcagctt 

cttgggtacg 

tcgatctctc 

tgctgaagat 

tctatttttc 

agggtttcgc 

atacttctat 

atcgcagatc 

tagtcatttt 

tttaaatcaa 

tgttgtatta 

gcaggtggca 

atctcttatc 

ctttatcttc 

cccacaaaat 

ctaagtgttc 

ccatggcgag 

tctacatcca 

ctatcgaatc 

tcaaacagtc 

tgcagcttaa 

gtggagccat 

tgttcccgga 

agtattttcg 

tgattgttag 

tttgtcaagt 

cgggtgtgac 

gtaatcctga 

ggtggtggat 

atatcctttt 

ctcttacgag 

ctgatgtgaa 

ttgggtattc 

gtttctttga 

gttccccggt 

ggttgcatca 

aagttaagga 

agcatcaagc 

accgacattt 

catgaaacgg 

tctctatttt 

atagggtttc 

aaatacttct 

attaa 



agagtggagc 

gacatgtagt 

aatgggagaa 

aggatcaagt 

gggaattgat 

tcaatggttt 

actacccaat 

actctcgtca 

acagacggtt 

caatgaccgt 

ccaacactaa 

accccataaa 

acggttcctt 

aggttcctcg 

ttatcctcaa 

gcacaagcat 

caaaggtaaa 

atgaccaaga 

ccatggaaat 

ccgctagagt 

tccagaataa 

tcatgtgttg 

caataaaatt 

tcattatacc 

acattgttgg 

aattgttgaa 

acttgaagtt 

acacaaacaa 

tgatttaatg 

aatattcaca 

actttgtccc 

atattattct 

gatctcgtgt 

gatgaggctt 

tgagaaccat 

gcggattgtt 

ggatctaatc 

ggctctaagg 

tttggcaaaa 

agtcacagtg 

tcatgatggt 

gaaacagatt 

cctgaatgat 

tcagtatggg 

gatgaacact 

cattgaagaa 

gctgaaaccc 

gtcaagagga 

ttttaataga 

tgattacaac 

gtacacattg 

aggtagagga 

aacagataaa 

gggttataaa 

atgtttggat 

atccgctaga 

tctccagaat 

gctcatgtgt 

atcaataaaa 



ttacctgttc 

cacctctgag 

atcacccctc 

ggtcgagata 

tgtccggtac 

ccaaatgagc 

gccatctctc 

atctctcggt 

ggttcgtgat 

ggtatttcac 

cccgaggaac 

cacattcatt 

tgctccttta 

tccatccaaa 

tagacgagct 

ggctaatgaa 

accaacttca 

gctacctcga 

acgagcttat 

ccgcaaaaat 

tgtgtgagta 

agcatataag 

tctaatccta 

gttagaagca 

gttctacatt 

ttatgctaaa 

atcataagaa 

gagtatctaa 

aatccacatg 

acttgttata 

cttatttgcc 

tcttctcaaa 

gacttgagat 

ttccagacgc 

tgcactagtc 

gccctcgaag 

cagacgtttg 

ttgcatagaa 

gatcgtgtgg 

gaaagtttgt 

tactttgaag 

ttctcgcctt 

tgtaagaaca 

aatcatcggt 

gtatgggatg 

gatcattttc 

gcaaagtgtc 

gaagggcttg 

agtgtgtggg 

tgggatataa 

cgagggccta 

gatgagggtg 

gttgtgaaca 

gccggtttcg 

tttgccacta 

gtccgcaaaa 

aatgtgtgag 

tgagcatata 

tttctaatcc 



aaaccagatg 

ggtctgctgg 

tctcagaaaa 

gaatatcatg 

aagactgatg 

aggagagaaa 

gcatttatcc 

gttgcaagcc 

gacggacggg 

cttcttgcgg 

ccttcgcttc 

gccaagaaac 

gccaaaccgt 

tactctcagc 

tgggattcag 

ccagtaaact 

ctgaatctct 

gatagttcac 

aagcttgaac 

caccagtctc 

gttcccagat 

aaacccttag 

aaaccaaaat 

tagttaaaat 

attaatgaat 

catgtaacat 

ccacaaatac 

gattttcatt 

ttcacttctc 

tccaccacaa 

accttttgta 

aaaacaaaaa 

ttcttctcat 

aatcacagta 

aaatgcgagg 

atatgaagaa 

aaaaaaaagg 

ggaaccattt 

ttatcgtctt 

cgaaggttaa 

agatgaatag 

attcgcctca 

agggtgatga 

ctccgaagat 

ggttggaaga 

tgtttcctaa 

ctgactgttt 

aaagtttggt 

agaatattca 

cgatgtgggc 

ggactagtgc 

attgcatcga 

taaaagaagg 

aaggttgggg 

tgtatcgtta 

atcaccagtc 

tagttcccag 

agaaaccctt 

taaaaccaaa 



gtgaagctca 

ttcaagaagt 

ctcgtcttta 

ttgagcttct 

ttgacaacaa 

cttatgataa 

aaggatccaa 

tcaaagaggg 

gtctagggca 

aatctaacat 

tctctcacct 

cgcaagacat 

taccatgtga 

aattggaaga 

cttattgcca 

tttccgacat 

tgcaagaaga 

agccacggga 

tgcgacctca 

tctctacaaa 

aagggaatta 

tatgtatttg 

cccgcgcgcg 

ctaaagcttg 

tttctaatgc 

acgtatatct 

actagtaaat 

tgtgactata 

atttgtccac 

tttcattctt 

tttaatttat 

caaaaaaaaa 

cccggcagct 

tgcagatcgc 

cctcatagat 

ccgccaggac 

aatagcaaaa 

ttcgcctaga 

gtatgtgcat 

aggtataagt 

gattgtggag 

tatatatcgt 

ggcaaagggg 

tgtatctttg 

gactaaagga 

tgcctatcgt 

tgctgctaat 

tgcagagaga 

tcagaaggca 

aacggttttc 

ggtacacttt 

taatggggtc 

atggggagtt 

aggttggggc 

cagcagtagc 

tctctctaca 

ataagggaat 

agtatgtatt 

atcccgcgag 



3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7155 



<210> 28 

<211> 4993 

<212> DNA 

<213> hybrid 

<400> 28 
ggcgcgcctc 
ttgatttttg 
acaaagatta 
atctttagga 
aactcgttat 
gaatatgttc 
gaccgtattg 
atgcagtaac 
tatttccatt 
ccatgtacga 
ggggtaccac 
tctccttacc 
gtggttaatg 
tgagccacag 
gttagatagc 
attcatagaa 
tgagatttct 
agacgcaatc 
ctagtcaaat 
tcgaagatat 
cgtttgaaaa 
gcgccgtcgt 
atggtggtcc 
agaagctcaa 
tagaggagta 
ctaaggtatg 
tcccgtgaac 
acttctttgc 
gttcttttgg 
gagattgaag 
ggattcaaga 
cgcttcacct 
tgttggaggt 
acagatagca 
ttgggctata 
ttttgaaaac 
taagaatctt 
tgttcatatg 
aatttgctgt 
gggaaagcac 
ggatcaatac 
agatgatttt 
gttgtttgat 
ggaggattat 
tgaggttggc 
tgcagatagg 
tgttgatcgt 
aggttattgc 
tgctgcaaga 
ggattatgtg 
ctttatgtct 
tcaatcccca 
tcacaagcca 
agaacagacg 
ggactcaaac 
caaactattc 
gagaacatat 
caaatacgct 
ggacaacgac 
cggatcactg 
aggtatgtac 
gccaattgtt 
cttctcttac 
cactggaggt 
tggtaatgat 
gaaggtcttc 
gatccctctt 
tggtcagaga 



gaggcgatcg 
tttcagtggt 
tttgttgatg 
gataccagcc 
cttttcattc 
ggccgatatg 
taccctcttt 
atataggtat 
tctgttatat 
aataacttct 
atataggaag 
acgaagagat 
ataagggatt 
gatccaatgg 
aaacaacatt 
gtccattcct 
tctcatcccg 
acagtatgca 
gcgaggcctc 
gaagaaccgc 
aaaaggaata 
tgatatcaca 
atggaaacaa 
aatcttcgtt 
ttatcagaga 
acgaaagttt 
aatcttaaat 
tgggttctgt 
aaaatttgaa 
gatagctaga 
agaaagttta 
aataaacaag 
ggctgggtta 
gagggtaata 
gatccctttg 
atgcttattc 
gaatatattt 
atgccgtttt 
cagtttgatt 
ccagtggaga 
aggaaaaaat 
aggtacatta 
cacatcaact 
ttcagaacag 
tctggtcagg 
caacaagact 
gtgctcgagc 
catcgaattc 
agaaatctgg 
gtacaagatt 
aaagcaatcg 
tcatttttcg 
attgctgccc 
agagaggagg 
tggacttgtg 
accggcagac 
ttcattgcta 
tctgagtttg 
gttactgaga 
cggaagatag 
tctagtccag 
caacctgatg 
cctaaaacca 
aatacgcttc 
tttgatgacc 
tattcagatc 
caaggaaact 
ttctccgtgc 



cagatctaat 
tacatatatc 
ttcttgatgg 
aggattatat 
aaaggatgag 
cctttgttgg 
ccataaagga 
tcaaaaatgg 
aaatttcaca 
attatttggt 
gtaacaaaat 
aagatataag 
acatccttct 
ccacaggaac 
ataaaaggtg 
cctaagtatc 
gcagctttca 
gatcgcctca 
atagatgaag 
caggacgaag 
gcaaaactca 
actaaagatc 
ggttggagag 
gttcctcatt 
caatccagac 
ttgcttttgg 
gtcttaaaat 
ttttttttag 
gtctttggag 
atcttatttg 
tatgggagga 
aagctttgac 
tgaatgatga 
tgtggctgaa 
gctattcatc 
aaaggactca 
ggcgtcagag 
attcatacga 
tcgctcggat 
ccacactaga 
ccactctata 
gtatcgatga 
ctaatcctag 
tccgagaaga 
ttgttggttt 
attggagtgg 
atacccttcg 
aatgtgagaa 
ctcttttcca 
acggcacccg 
aagttcttct 
aggcagagca 
gggaaggaaa 
tggtgacggt 
tccctagcca 
atcgccttta 
atgggaatgt 
acccatttcc 
tccgaaatga 
tccatagaaa 
agagtggagc 
gacatgtagt 
aatgggagaa 
aggatcaagt 
gggaattgat 
tcaatggttt 
actacccaat 
actctcgtca 



ctaaccaatt 
ttgttttata 
ggctcagaag 
tcagtaagac 
ccagaatctt 
cttcaatatt 
aaacacaata 
ctaaaagaag 
acacacaaaa 
attgggccta 
actgcaagat 
acccaccctg 
atgtttgtgg 
gtaagaatgt 
tgtatcaata 
tagaaaccat 
tgttcatcta 
gttccgctat 
ttagcatcaa 
aacttgtgca 
ctcaaggtgg 
tatacgatag 
ttacgtataa 
ctcataacga 
atattcttga 
ttttaatatt 
tctcatgacg 
tttcgtgatg 
ctaaagtttg 
tgtgggggtt 
gatgtcatat 
taaattggtt 
ggctaattca 
tgacacaatt 
aaccatggct 
ttacgagctc 
ctgggatgct 
tatcccacac 
gcggggattt 
aaatgtgcag 
tcgaactaat 
agccgaggct 
tctaaacgca 
agcagacaga 
cccttctctg 
ttattatgtt 
tggagctgag 
atttccaaca 
gcaccatgat 
gatgcatact 
tgggatccgc 
aatgagatca 
ttcgcacaca 
tgttgttaac 
aatttctcct 
ctggaaagct 
cgagtgtgag 
ttgtcctcct 
acatcagact 
cggatcagag 
ttacctgttc 
cacctctgag 
atcacccctc 
ggtcgagata 
tgtccggtac 
ccaaatgagc 
gccatctctc 
atctctcggt 



acgatacgct 
tgctatcttt 
atttgatatg 
aatcaaattt 
tatagaatga 
ctacatatca 
tgcagatgct 
ttggataaca 
gcccgtaatc 
agcccagctc 
agccccataa 
ccacgtgtca 
acatgatgca 
agatagattt 
ggaactaatt 
ggcgaggatc 
catccagatg 
cgaatctgag 
acagtcgcgg 
gcttaaggat 
agccatggat 
gattgagttt 
agacgatgag 
tcctggttgg 
caccattgtt 
ttaattctct 
tcattaaact 
aaacagagtt 
tttttttatt 
tgttttgaat 
ctggagagat 
aaggatgggc 
cattattttg 
ggggttattc 
tatcttctcc 
aagaaagacc 
atggaaacca 
acttgtggac 
aagtatgaac 
gagagggcat 
acacttctta 
cagttccgta 
gaagcaaagt 
gtgaattatt 
tcaggtgact 
tcaagacctt 
atcatgatgt 
agttttacgt 
ggggtaactg 
tcattgcaag 
cacgagaaag 
aagtatgatg 
gttatactct 
cgcgctgaaa 
gaagtgcagc 
tccatcccag 
aaagctactc 
ccatattcct 
cttgtgtttg 
actgttgtgg 
aaaccagatg 
ggtctgctgg 
tctcagaaaa 
gaatatcatg 
aagactgatg 
aggagagaaa 
gcatttatcc 
gttgcaagcc 



ttgggtacac 
aaggatctgc 
atacactcta 
tacgtgttca 
ttgcaatcga 
cacaagaatc 
tttttcccac 
aattgacaac 
aagagtctgc 
agagtacgtg 
cgtaccagcc 
catcgtcatg 
tgtaatgtca 
gattttgtcc 
cactcattgg 
tcgtgtgact 
aggcttttcc 
aaccattgca 
attgttgccc 
ctaatccaga 
tccaattcag 
cttgatacag 
tgggagaaag 
aaattgactg 
gagactttat 
cccatggtta 
ctataaccaa 
ctagaagttc 
actgggtttt 
atgtttaata 
ggtggagaga 
agctagagat 
ccataattga 
ctaagaattc 
ggcgtatggg 
ttgcccagca 
cagatatctt 
cagagcctgc 
tttgtccatg 
taaagcttct 
tacctcttgg 
actaccagat 
ttggtacttt 
ctcgtcctgg 
tctttacata 
tcttcaaagc 
catttctgct 
ataagttgac 
gaactgctaa 
accttcagat 
aaaaatctga 
ctcggccagt 
tcaatccatc 
tctcggtttt 
atgacgatac 
ctcttggtct 
cgtctaaact 
gctccaaact 
atgtgaagaa 
gagaagagat 
gtgaagctca 
ttcaagaagt 
ctcgtcttta 
ttgagcttct 
ttgacaacaa 
cttatgataa 
aaggatccaa 
tcaaagaggg 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
ttggttggag attatgctgg acagacggtt ggttcgtgat gacggacggg gtctagggca 4140 

aggtgtgatg gataaccgcg caatgaccgt ggtatttcac cttcttgcgg aatctaacat 4200 

ttctcaagca gaccctgctt ccaacactaa cccgaggaac ccttcgcttc tctctcacct 4260 

cataggtgct cacttaaact accccataaa cacattcatt gccaagaaac cgcaagacat 4320 

atctgtgcgt gttccacaat acggttcctt tgctccttta gccaaaccgt taccatgtga 4380 

cctccacatt gtaaatttca aggttcctcg tccatccaaa tactctcagc aattggaaga 4440 

agacaagcca aggttcgctc ttatcctcaa tagacgagct tgggattcag cttattgcca 4500 

taaaggaaga caagtaaact gcacaagcat ggctaatgaa ccagtaaact tttccgacat 4560 

gttcaaagat cttgcagctt caaaggtaaa accaacttca ctgaatctct tgcaagaaga 4620 

tatggagatt cttgggtacg atgaccaaga gctacctcga gatagttcac agccacggga 4680 

aggacgtgtc tcgatctctc ccatggaaat acgagcttat aagcttgaac tgcgacctca 4740 

caagtgaacc tgctgaagat ccgctagagt ccgcaaaaat caccagtctc tctctacaaa 4800 

tctatctctc tctatttttc tccagaataa tgtgtgagta gttcccagat aagggaatta 4860 

gggttcttat agggtttcgc tcatgtgttg agcatataag aaacccttag tatgtatttg 4920 

tatttgtaaa atacttctat caataaaatt tctaatccta aaaccaaaat cccgcgagag 4980 

acctcttaat taa 4993 



<210> 29 

<211> 3825 

<212> DNA 

<213> hybrid 

<400> 29 

ccatggcgag gatctcgtgt gacttgagat ttcttctcat cccggcagct ttcatgttca 60 

tctacatcca gatgaggctt ttccagacgc aatcacagta tgcagatcgc ctcagttccg 120 

ctatcgaatc tgagaaccat tgcactagtc aaatgcgagg cctcatagat gaagttagca 180 

tcaaacagtc gcggattgtt gccctcgaag atatgaagaa ccgccaggac gaagaacttg 240 

tgcagcttaa ggatctaatc cagacgtttg aaaaaaaagg aatagcaaaa ctcactcaag 300 

gtggagccat ggattccaat tcaggcgccg tcgttgatat cacaactaaa gatctatacg 360 

ataggattga gtttcttgat acagatggtg gtccatggaa acaaggttgg agagttacgt 420 

ataaagacga tgagtgggag aaagagaagc tcaaaatctt cgttgttcct cattctcata 480 

acgatcctgg ttggaaattg actgtagagg agtattatca gagacaatcc agacatattc 540 

ttgacaccat tgttgagact ttatctaagg tatgacgaaa gtttttgctt ttggttttaa 600 

tattttaatt ctctcccatg gttatcccgt gaacaatctt aaatgtctta aaattctcat 660 

gacgtcatta aactctataa ccaaacttct ttgctgggtt ctgttttttt ttagtttcgt 720 

gatgaaacag agttctagaa gttcgttctt ttggaaaatt tgaagtcttt ggagctaaag 780 

tttgtttttt tattactggg ttttgagatt gaaggatagc tagaatctta tttgtgtggg 840 

ggtttgtttt gaatatgttt aataggattc aagaagaaag tttatatggg aggagatgtc 900 

atatctggag agatggtgga gagacgcttc acctaataaa caagaagctt tgactaaatt 960 

ggttaaggat gggcagctag agattgttgg aggtggctgg gttatgaatg atgaggctaa 1020 

ttcacattat tttgccataa ttgaacagat agcagagggt aatatgtggc tgaatgacac 1080 

aattggggtt attcctaaga attcttgggc tatagatccc tttggctatt catcaaccat 1140 

ggcttatctt ctccggcgta tgggttttga aaacatgctt attcaaagga ctcattacga 1200 

gctcaagaaa gaccttgccc agcataagaa tcttgaatat atttggcgtc agagctggga 1260 

tgctatggaa accacagata tctttgttca tatgatgccg ttttattcat acgatatccc 1320 

acacacttgt ggaccagagc ctgcaatttg ctgtcagttt gatttcgctc ggatgcgggg 1380 

atttaagtat gaactttgtc catggggaaa gcacccagtg gagaccacac tagaaaatgt 1440 

gcaggagagg gcattaaagc ttctggatca atacaggaaa aaatccactc tatatcgaac 1500 

taatacactt cttatacctc ttggagatga ttttaggtac attagtatcg atgaagccga 1560 

ggctcagttc cgtaactacc agatgttgtt tgatcacatc aactctaatc ctagtctaaa 1620 

cgcagaagca aagtttggta ctttggagga ttatttcaga acagtccgag aagaagcaga 1680 

cagagtgaat tattctcgtc ctggtgaggt tggctctggt caggttgttg gtttcccttc 1740 

tctgtcaggt gacttcttta catatgcaga taggcaacaa gactattgga gtggttatta 1800 

tgtttcaaga cctttcttca aagctgttga tcgtgtgctc gagcataccc ttcgtggagc 1860 

tgagatcatg atgtcatttc tgctaggtta ttgccatcga attcaatgtg agaaatttcc 1920 

aacaagtttt acgtataagt tgactgctgc aagaagaaat ctggctcttt tccagcacca 1980 

tgatggggta actggaactg ctaaggatta tgtggtacaa gattacggca cccggatgca 2040 

tacttcattg caagaccttc agatctttat gtctaaagca atcgaagttc ttcttgggat 2100 

ccgccacgag aaagaaaaat ctgatcaatc cccatcattt ttcgaggcag agcaaatgag 2160 

atcaaagtat gatgctcggc cagttcacaa gccaattgct gcccgggaag gaaattcgca 2220 

cacagttata ctcttcaatc catcagaaca gacgagagag gaggtggtga cggttgttgt 2280 

taaccgcgct gaaatctcgg ttttggactc aaactggact tgtgtcccta gccaaatttc 2340 

tcctgaagtg cagcatgacg ataccaaact attcaccggc agacatcgcc tttactggaa 2400 

agcttccatc ccagctcttg gtctgagaac atatttcatt gctaatggga atgtcgagtg 2460 

tgagaaagct actccgtcta aactcaaata cgcttctgag tttgacccat ttccttgtcc 2520 

tcctccatat tcctgctcca aactggacaa cgacgttact gagatccgaa atgaacatca 2580 

gactcttgtg tttgatgtga agaacggatc actgcggaag atagtccata gaaacggatc 2640 

agagactgtt gtgggagaag agataggtat gtactctagt ccagagagtg gagcttacct 2700 

gttcaaacca gatggtgaag ctcagccaat tgttcaacct gatggacatg tagtcacctc 2760 

tgagggtctg ctggttcaag aagtcttctc ttaccctaaa accaaatggg agaaatcacc 2820 

cctctctcag aaaactcgtc tttacactgg aggtaatacg cttcaggatc aagtggtcga 2880 

gatagaatat catgttgagc ttcttggtaa tgattttgat gaccgggaat tgattgtccg 2940 

gtacaagact gatgttgaca acaagaaggt cttctattca gatctcaatg gtttccaaat 3000 

gagcaggaga gaaacttatg ataagatccc tcttcaagga aactactacc caatgccatc 3060 

tctcgcattt atccaaggat ccaatggtca gagattctcc gtgcactctc gtcaatctct 3120 

cggtgttgca agcctcaaag agggttggtt ggagattatg ctggacagac ggttggttcg 3180 

tgatgacgga cggggtctag ggcaaggtgt gatggataac cgcgcaatga ccgtggtatt 3240 

tcaccttctt gcggaatcta acatttctca agcagaccct gcttccaaca ctaacccgag 3300 

gaacccttcg cttctctctc acctcatagg tgctcactta aactacccca taaacacatt 3360 

cattgccaag aaaccgcaag acatatctgt gcgtgttcca caatacggtt cctttgctcc 3420 

tttagccaaa ccgttaccat gtgacctcca cattgtaaat ttcaaggttc ctcgtccatc 3480 

caaatactct cagcaattgg aagaagacaa gccaaggttc gctcttatcc tcaatagacg 3540 

agcttgggat tcagcttatt gccataaagg aagacaagta aactgcacaa gcatggctaa 3600 

tgaaccagta aacttttccg acatgttcaa agatcttgca gcttcaaagg taaaaccaac 3660 

ttcactgaat ctcttgcaag aagatatgga gattcttggg tacgatgacc aagagctacc 3720 

tcgagatagt tcacagccac gggaaggacg tgtctcgatc tctcccatgg aaatacgagc 3780 

ttataagctt gaactgcgac ctcacaagtg aacctgctga agatc 3825 



<210> 30 

<211> 2181 

<212> DNA 

<213> hybrid 

<400> 30 

ggcgcgcctc gaggcgatcg cagatctcat tataccgtta gaagcatagt taaaatctaa 60 

agcttgtcgt taattctagt cattttacat tgttgggttc tacattatta atgaattttc 120 

taatgcaaat acagaattta aatcaaaatt gttgaattat gctaaacatg taacatacgt 180 

atatctccgc cttgtgtgtt gtattaactt gaagttatca taagaaccac aaatacacta 240 

gtaaatctat gagaaggcag gtggcaacac aaacaagagt atctaagatt ttcatttgtg 300 

actataggaa tataatatct cttatctgat ttaatgaatc cacatgttca cttctcattt 360 

gtccacaaga tcacaacttt atcttcaata ttcacaactt gttatatcca ccacaatttc 420 

attcttttca cttagcccca caaaatactt tgtcccctta tttgccacct tttgtattta 480 

atttattctt gtggagctaa gtgttcatat tattcttctt ctcaaaaaaa caaaaacaaa 540 

aaaaaagaga agaaaaccat ggcgaggatc tcgtgtgact tgagatttct tctcatcccg 600 

gcagctttca tgttcatcta catccagatg aggcttttcc agacgcaatc acagtatgca 660 

gatcgcctca gttccgctat cgaatctgag aaccattgca ctagtcaaat gcgaggcctc 720 

atagatgaag ttagcatcaa acagtcgcgg attgttgccc tcgaagatat gaagaaccgc 780 

caggacgaag aacttgtgca gcttaaggat ctaatccaga cgtttgaaaa aaaaggaata 840 

gcaaaactca ctcaaggtgg agccatggct ctaaggttgc atagaaggaa ccatttttcg 900 

cctagaaata cggatctgtt cccggatttg gcaaaagatc gtgtggttat cgtcttgtat 960 

gtgcataatc gggctcagta ttttcgagtc acagtggaaa gtttgtcgaa ggttaaaggt 1020 

ataagtgaga cattgttgat tgttagtcat gatggttact ttgaagagat gaataggatt 1080 

gtggagagta ttaagttttg tcaagtgaaa cagattttct cgccttattc gcctcatata 1140 

tatcgtacta gcttcccggg tgtgaccctg aatgattgta agaacaaggg tgatgaggca 1200 

aaggggcatt gtgaaggtaa tcctgatcag tatgggaatc atcggtctcc gaagattgta 1260 

tctttgaagc atcactggtg gtggatgatg aacactgtat gggatgggtt ggaagagact 1320 

aaaggacatg aggggcatat ccttttcatt gaagaagatc attttctgtt tcctaatgcc 1380 

tatcgtaaca tacagactct tacgaggctg aaacccgcaa agtgtcctga ctgttttgct 1440 

gctaatttag caccgtctga tgtgaagtca agaggagaag ggcttgaaag tttggttgca 1500 

gagagaatgg gaaatgttgg gtattctttt aatagaagtg tgtgggagaa tattcatcag 1560 

aaggcaagag agttttgttt ctttgatgat tacaactggg atataacgat gtgggcaacg 1620 

gttttcccgt cgtttggttc cccggtgtac acattgcgag ggcctaggac tagtgcggta 1680 

cactttggaa aatgtgggtt gcatcaaggt agaggagatg agggtgattg catcgataat 1740 

ggggtcgtaa acatagaagt taaggaaaca gataaagttg tgaacataaa agaaggatgg 1800 

ggagttcggg tgtataagca tcaagcgggt tataaagccg gtttcgaagg ttggggaggt 1860 

tggggcgatg atagggaccg acatttatgt ttggattttg ccactatgta tcgttacagc 1920 

agtagcagtg catctccatg aaacggatcc gctagagtcc gcaaaaatca ccagtctctc 1980 

tctacaaatc tatctctctc tatttttctc cagaataatg tgtgagtagt tcccagataa 2040 

gggaattagg gttcttatag ggtttcgctc atgtgttgag catataagaa acccttagta 2100 

tgtatttgta tttgtaaaat acttctatca ataaaatttc taatcctaaa accaaaatcc 2160 

cgcgagagac ctcttaatta a 2181 



<210> 31 
<211> 1394 
<212> DNA 
<213> hybrid 



<400> 31 

ccatggcgag gatctcgtgt gacttgagat ttcttctcat cccggcagct ttcatgttca 60 

tctacatcca gatgaggctt ttccagacgc aatcacagta tgcagatcgc ctcagttccg 120 

ctatcgaatc tgagaaccat tgcactagtc aaatgcgagg cctcatagat gaagttagca 180 

tcaaacagtc gcggattgtt gccctcgaag atatgaagaa ccgccaggac gaagaacttg 240 

tgcagcttaa ggatctaatc cagacgtttg aaaaaaaagg aatagcaaaa ctcactcaag 300 

gtggagccat ggctctaagg ttgcatagaa ggaaccattt ttcgcctaga aatacggatc 360 

tgttcccgga tttggcaaaa gatcgtgtgg ttatcgtctt gtatgtgcat aatcgggctc 420 

agtattttcg agtcacagtg gaaagtttgt cgaaggttaa aggtataagt gagacattgt 480 

tgattgttag tcatgatggt tactttgaag agatgaatag gattgtggag agtattaagt 540 

tttgtcaagt gaaacagatt ttctcgcctt attcgcctca tatatatcgt actagcttcc 600 

cgggtgtgac cctgaatgat tgtaagaaca agggtgatga ggcaaagggg cattgtgaag 660 

gtaatcctga tcagtatggg aatcatcggt ctccgaagat tgtatctttg aagcatcact 720 

ggtggtggat gatgaacact gtatgggatg ggttggaaga gactaaagga catgaggggc 780 

atatcctttt cattgaagaa gatcattttc tgtttcctaa tgcctatcgt aacatacaga 840 

ctcttacgag gctgaaaccc gcaaagtgtc ctgactgttt tgctgctaat ttagcaccgt 900 

ctgatgtgaa gtcaagagga gaagggcttg aaagtttggt tgcagagaga atgggaaatg 960 

ttgggtattc ttttaataga agtgtgtggg agaatattca tcagaaggca agagagtttt 1020 

gtttctttga tgattacaac tgggatataa cgatgtgggc aacggttttc ccgtcgtttg 1080 

gttccccggt gtacacattg cgagggccta ggactagtgc ggtacacttt ggaaaatgtg 1140 

ggttgcatca aggtagagga gatgagggtg attgcatcga taatggggtc gtaaacatag 1200 

aagttaagga aacagataaa gttgtgaaca taaaagaagg atggggagtt cgggtgtata 1260 

agcatcaagc gggttataaa gccggtttcg aaggttgggg aggttggggc gatgataggg 1320 

accgacattt atgtttggat tttgccacta tgtatcgtta cagcagtagc agtgcatctc 1380 

catgaaacgg atcc 1394 



<210> 32
<211> 312
<212> DNA
<213> Arabidopsis thaliana

<400> 32
ccatggcgag gatctcgtgt gacttgagat ttcttctcat cccggcagct ttcatgttca 60
tctacatcca gatgaggctt ttccagacgc aatcacagta tgcagatcgc ctcagttccg 120
ctatcgaatc tgagaaccat tgcactagtc aaatgcgagg cctcatagat gaagttagca 180
tcaaacagtc gcggattgtt gccctcgaag atatgaagaa ccgccaggac gaagaacttg 240
tgcagcttaa ggatctaatc cagacgtttg aaaaaaaagg aatagcaaaa ctcactcaag 300
gtggagccat gg 312



<210> 33
<211> 276
<212> DNA
<213> Glycine max

<400> 33
ccatggcgag agggagcaga tcagtgggta gcagcagcag caaatggagg tactgcaacc 60
cttcctatta cttgaagcgc ccaaagcgtc ttgctctgct cttcatcgtt ttcgtttgtg 120
tctctttcgt tttctgggac cgtcaaactc tcgtcagaga gcaccaggtt gaaatttctg 180
agctgcagaa agaagtgact gatttgaaaa atttggtgga tgatttaaat aacaaacaag 240
gtggtacctc tgggaaaact gacttgggga ccatgg 276



<210> 34
<211> 9240
<212> DNA
<213> hybrid

<400> 34
ggcgcgcctc gaggcgatcg cagatccgat ataacaaaat ttgaatcgca cagatcgatc 60
tctttggaga ttctatacct agaaaatgga gacgattttc aaatctctgt aaaaattctg 120
gtttcttctt gacggaagaa gacgacgact ccaatatttc ggttagtact gaaccggaaa 180
gtttgactgg tgcaaccaat ttaatgtacc gtacgtaacg caccaatcgg attttgtatt 240
caatgggcct tatctgtgag cccattaatt gatgtgacgg cctaaactaa atccgaacgg 300
tttatttcag cgatccgcga cggtttgtat tcagccaata gcaatcaatt atgtagcagt 360
ggtgatcctc gtcaaaccag taaagctaga tctggaccgt tgaattggtg caagaaagca 420
catgttgtga tatttttacc cgtacgatta gaaaacttga gaaacacatt gataatcgat 480
aaaaaccgtc cgatcatata aatccgcttt accatcgttg cctataaatt aatatcaata 540
gccgtacacg cgtgaagact gacaatatta tctttttcga attcggagct caagtttgaa 600
attcggagaa gctagagagt tttctgataa ccatggcgag agggagcaga tcagtgggta 660
gcagcagcag caaatggagg tactgcaacc cttcctatta cttgaagcgc ccaaagcgtc 720
ttgctctgct cttcatcgtt ttcgtttgtg tctctttcgt tttctgggac cgtcaaactc 780
tcgtcagaga gcaccaggtt gaaatttctg agctgcagaa agaagtgact gatttgaaaa 840
atttggtgga tgatttaaat aacaaacaag gtggtacctc tgggaaaact gacttgggga 900
ccatgggaca gatgcctgtg gctgctgtag tggttatggc ctgcagtcgt gcagactatc 960
ttgaaaggac tgttaaatca gttttaacat atcaaactcc cgttgcttca aaatatcctc 1020
tatttatatc tcaggatgga tctgatcaag ctgtcaagag caagtcattg agctataatc 1080
aattaacata tatgcagcac ttggattttg aaccagtggt cactgaaagg cctggcgaac 1140
tgactgcgta ctacaagatt gcacgtcact acaagtgggc actggaccag ttgttttaca 1200
aacacaaatt tagtcgagtg attatactag aagatgatat ggaaattgct ccagacttct 1260
ttgattactt tgaggctgca gctagtctca tggataggga taaaaccatt atggctgctt 1320
catcatggaa tgataatgga cagaagcagt ttgtgcatga tccctatgcg ctataccgat 1380
cagatttttt tcctggcctt gggtggatgc tcaagagatc gacttgggat gagttatcac 1440
caaagtggcc aaaggcttac tgggatgatt ggctgagact aaaggaaaac cataaaggcc 1500
gccaattcat tcgaccggaa gtctgtagaa catacaattt tggtgaacat gggtctagtt 1560
tgggacagtt tttcagtcag tatctggaac ctataaagct aaacgatgtg acggttgact 1620
ggaaagcaaa ggacctggga tacctgacag agggaaacta taccaagtac ttttctggct 1680
tagtgagaca agcacgacca attcaaggtt ctgaccttgt cttaaaggct caaaacataa 1740
aggatgatgt tcgtatccgg tataaagacc aagtagagtt tgaacgcatt gcaggggaat 1800
ttggtatatt tgaagaatgg aaggatggtg tgcctcgaac agcatataaa ggagtagtgg 1860
tgtttcgaat ccagacaaca agacgtgtat tcctggttgg gccagattct gtaatgcagc 1920
ttggaattcg aaattcctga tgcggatccg ctagagtccg caaaaatcac cagtctctct 1980
ctacaaatct atctctctct atttttctcc agaataatgt gtgagtagtt cccagataag 2040
ggaattaggg ttcttatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 2100
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aatcctaaaa ccaaaatccc 2160
gcgcctcgag gcgatcgcag atctaatcta accaattacg atacgctttg ggtacacttg 2220
atttttgttt cagtggttac atatatcttg ttttatatgc tatctttaag gatctgcaca 2280
aagattattt gttgatgttc ttgatggggc tcagaagatt tgatatgata cactctaatc 2340
tttaggagat accagccagg attatattca gtaagacaat caaattttac gtgttcaaac 2400
tcgttatctt ttcattcaaa ggatgagcca gaatctttat agaatgattg caatcgagaa 2460
tatgttcggc cgatatgcct ttgttggctt caatattcta catatcacac aagaatcgac 2520
cgtattgtac cctctttcca taaaggaaaa cacaatatgc agatgctttt ttcccacatg 2580
cagtaacata taggtattca aaaatggcta aaagaagttg gataacaaat tgacaactat 2640
ttccatttct gttatataaa tttcacaaca cacaaaagcc cgtaatcaag agtctgccca 2700
tgtacgaaat aacttctatt atttggtatt gggcctaagc ccagctcaga gtacgtgggg 2760
gtaccacata taggaaggta acaaaatact gcaagatagc cccataacgt accagcctct 2820
ccttaccacg aagagataag atataagacc caccctgcca cgtgtcacat cgtcatggtg 2880
gttaatgata agggattaca tccttctatg tttgtggaca tgatgcatgt aatgtcatga 2940
gccacaggat ccaatggcca caggaacgta agaatgtaga tagatttgat tttgtccgtt 3000
agatagcaaa caacattata aaaggtgtgt atcaatagga actaattcac tcattggatt 3060
catagaagtc cattcctcct aagtatctag aaaccatggc gagagggagc agatcagtgg 3120
gtagcagcag cagcaaatgg aggtactgca acccttccta ttacttgaag cgcccaaagc 3180
gtcttgctct gctcttcatc gttttcgttt gtgtctcttt cgttttctgg gaccgtcaaa 3240
ctctcgtcag agagcaccag gttgaaattt ctgagctgca gaaagaagtg actgatttga 3300
aaaatttggt ggatgattta aataacaaac aaggtggtac ctctgggaaa actgacttgg 3360
ggaccatgga ttccaattca ggcgccgtcg ttgatatcac aactaaagat ctatacgata 3420
ggattgagtt tcttgataca gatggtggtc catggaaaca aggttggaga gttacgtata 3480
aagacgatga gtgggagaaa gagaagctca aaatcttcgt tgttcctcat tctcataacg 3540
atcctggttg gaaattgact gtagaggagt attatcagag acaatccaga catattcttg 3600
acaccattgt tgagacttta tctaaggtat gacgaaagtt tttgcttttg gttttaatat 3660
tttaattctc tcccatggtt atcccgtgaa caatcttaaa tgtcttaaaa ttctcatgac 3720
gtcattaaac tctataacca aacttctttg ctgggttctg ttttttttta gtttcgtgat 3780
gaaacagagt tctagaagtt cgttcttttg gaaaatttga agtctttgga gctaaagttt 3840
gtttttttat tactgggttt tgagattgaa ggatagctag aatcttattt gtgtgggggt 3900
ttgttttgaa tatgtttaat aggattcaag aagaaagttt atatgggagg agatgtcata 3960
tctggagaga tggtggagag acgcttcacc taataaacaa gaagctttga ctaaattggt 4020
taaggatggg cagctagaga ttgttggagg tggctgggtt atgaatgatg aggctaattc 4080
acattatttt gccataattg aacagatagc agagggtaat atgtggctga atgacacaat 4140
tggggttatt cctaagaatt cttgggctat agatcccttt ggctattcat caaccatggc 4200
ttatcttctc cggcgtatgg gttttgaaaa catgcttatt caaaggactc attacgagct 4260
caagaaagac cttgcccagc ataagaatct tgaatatatt tggcgtcaga gctgggatgc 4320
tatggaaacc acagatatct ttgttcatat gatgccgttt tattcatacg atatcccaca 4380

cacttgtgga ccagagcctg caatttgctg tcagtttgat ttcgctcgga tgcggggatt 4440
taagtatgaa ctttgtccat ggggaaagca cccagtggag accacactag aaaatgtgca 4500
ggagagggca ttaaagcttc tggatcaata caggaaaaaa tccactctat atcgaactaa 4560
tacacttctt atacctcttg gagatgattt taggtacatt agtatcgatg aagccgaggc 4620
tcagttccgt aactaccaga tgttgtttga tcacatcaac tctaatccta gtctaaacgc 4680
agaagcaaag tttggtactt tggaggatta tttcagaaca gtccgagaag aagcagacag 4740
agtgaattat tctcgtcctg gtgaggttgg ctctggtcag gttgttggtt tcccttctct 4800
gtcaggtgac ttctttacat atgcagatag gcaacaagac tattggagtg gttattatgt 4860
ttcaagacct ttcttcaaag ctgttgatcg tgtgctcgag catacccttc gtggagctga 4920
gatcatgatg tcatttctgc taggttattg ccatcgaatt caatgtgaga aatttccaac 4980
aagttttacg tataagttga ctgctgcaag aagaaatctg gctcttttcc agcaccatga 5040
tggggtaact ggaactgcta aggattatgt ggtacaagat tacggcaccc ggatgcatac 5100
ttcattgcaa gaccttcaga tctttatgtc taaagcaatc gaagttcttc ttgggatccg 5160
ccacgagaaa gaaaaatctg atcaatcccc atcatttttc gaggcagagc aaatgagatc 5220
aaagtatgat gctcggccag ttcacaagcc aattgctgcc cgggaaggaa attcgcacac 5280
agttatactc ttcaatccat cagaacagac gagagaggag gtggtgacgg ttgttgttaa 5340
ccgcgctgaa atctcggttt tggactcaaa ctggacttgt gtccctagcc aaatttctcc 5400
tgaagtgcag catgacgata ccaaactatt caccggcaga catcgccttt actggaaagc 5460
ttccatccca gctcttggtc tgagaacata tttcattgct aatgggaatg tcgagtgtga 5520
gaaagctact ccgtctaaac tcaaatacgc ttctgagttt gacccatttc cttgtcctcc 5580
tccatattcc tgctccaaac tggacaacga cgttactgag atccgaaatg aacatcagac 5640
tcttgtgttt gatgtgaaga acggatcact gcggaagata gtccatagaa acggatcaga 5700
gactgttgtg ggagaagaga taggtatgta ctctagtcca gagagtggag cttacctgtt 5760
caaaccagat ggtgaagctc agccaattgt tcaacctgat ggacatgtag tcacctctga 5820
gggtctgctg gttcaagaag tcttctctta ccctaaaacc aaatgggaga aatcacccct 5880
ctctcagaaa actcgtcttt acactggagg taatacgctt caggatcaag tggtcgagat 5940
agaatatcat gttgagcttc ttggtaatga ttttgatgac cgggaattga ttgtccggta 6000
caagactgat gttgacaaca agaaggtctt ctattcagat ctcaatggtt tccaaatgag 6060
caggagagaa acttatgata agatccctct tcaaggaaac tactacccaa tgccatctct 6120
cgcatttatc caaggatcca atggtcagag attctccgtg cactctcgtc aatctctcgg 6180
tgttgcaagc ctcaaagagg gttggttgga gattatgctg gacagacggt tggttcgtga 6240
tgacggacgg ggtctagggc aaggtgtgat ggataaccgc gcaatgaccg tggtatttca 6300
ccttcttgcg gaatctaaca tttctcaagc agaccctgct tccaacacta acccgaggaa 6360
cccttcgctt ctctctcacc tcataggtgc tcacttaaac taccccataa acacattcat 6420
tgccaagaaa ccgcaagaca tatctgtgcg tgttccacaa tacggttcct ttgctccttt 6480
agccaaaccg ttaccatgtg acctccacat tgtaaatttc aaggttcctc gtccatccaa 6540
atactctcag caattggaag aagacaagcc aaggttcgct cttatcctca atagacgagc 6600
ttgggattca gcttattgcc ataaaggaag acaagtaaac tgcacaagca tggctaatga 6660
accagtaaac ttttccgaca tgttcaaaga tcttgcagct tcaaaggtaa aaccaacttc 6720
actgaatctc ttgcaagaag atatggagat tcttgggtac gatgaccaag agctacctcg 6780
agatagttca cagccacggg aaggacgtgt ctcgatctct cccatggaaa tacgagctta 6840
taagcttgaa ctgcgacctc acaagtgaac ctgctgaaga tccgctagag tccgcaaaaa 6900
tcaccagtct ctctctacaa atctatctct ctctattttt ctccagaata atgtgtgagt 6960
agttcccaga taagggaatt agggttctta tagggtttcg ctcatgtgtt gagcatataa 7020
gaaaccctta gtatgtattt gtatttgtaa aatacttcta tcaataaaat ttctaatcct 7080
aaaaccaaaa tcccgcgcgc gcctcgaggc gatcgcagat ctcattatac cgttagaagc 7140
atagttaaaa tctaaagctt gtcgttaatt ctagtcattt tacattgttg ggttctacat 7200
tattaatgaa ttttctaatg caaatacaga atttaaatca aaattgttga attatgctaa 7260
acatgtaaca tacgtatatc tccgccttgt gtgttgtatt aacttgaagt tatcataaga 7320
accacaaata cactagtaaa tctatgagaa ggcaggtggc aacacaaaca agagtatcta 7380
agattttcat ttgtgactat aggaatataa tatctcttat ctgatttaat gaatccacat 7440
gttcacttct catttgtcca caagatcaca actttatctt caatattcac aacttgttat 7500
atccaccaca atttcattct tttcacttag ccccacaaaa tactttgtcc ccttatttgc 7560
caccttttgt atttaattta ttcttgtgga gctaagtgtt catattattc ttcttctcaa 7620
aaaaacaaaa acaaaaaaaa agagaagaaa accatggcga gagggagcag atcagtgggt 7680
agcagcagca gcaaatggag gtactgcaac ccttcctatt acttgaagcg cccaaagcgt 7740
cttgctctgc tcttcatcgt tttcgtttgt gtctctttcg ttttctggga ccgtcaaact 7800
ctcgtcagag agcaccaggt tgaaatttct gagctgcaga aagaagtgac tgatttgaaa 7860
aatttggtgg atgatttaaa taacaaacaa ggtggtacct ctgggaaaac tgacttgggg 7920
accatggctc taaggttgca tagaaggaac catttttcgc ctagaaatac ggatctgttc 7980
ccggatttgg caaaagatcg tgtggttatc gtcttgtatg tgcataatcg ggctcagtat 8040
tttcgagtca cagtggaaag tttgtcgaag gttaaaggta taagtgagac attgttgatt 8100
gttagtcatg atggttactt tgaagagatg aataggattg tggagagtat taagttttgt 8160
caagtgaaac agattttctc gccttattcg cctcatatat atcgtactag cttcccgggt 8220
gtgaccctga atgattgtaa gaacaagggt gatgaggcaa aggggcattg tgaaggtaat 8280
cctgatcagt atgggaatca tcggtctccg aagattgtat ctttgaagca tcactggtgg 8340
tggatgatga acactgtatg ggatgggttg gaagagacta aaggacatga ggggcatatc 8400
cttttcattg aagaagatca ttttctgttt cctaatgcct atcgtaacat acagactctt 8460
acgaggctga aacccgcaaa gtgtcctgac tgttttgctg ctaatttagc accgtctgat 8520
gtgaagtcaa gaggagaagg gcttgaaagt ttggttgcag agagaatggg aaatgttggg 8580
tattctttta atagaagtgt gtgggagaat attcatcaga aggcaagaga gttttgtttc 8640
tttgatgatt acaactggga tataacgatg tgggcaacgg ttttcccgtc gtttggttcc 8700
ccggtgtaca cattgcgagg gcctaggact agtgcggtac actttggaaa atgtgggttg 8760
catcaaggta gaggagatga gggtgattgc atcgataatg gggtcgtaaa catagaagtt 8820
aaggaaacag ataaagttgt gaacataaaa gaaggatggg gagttcgggt gtataagcat 8880
caagcgggtt ataaagccgg tttcgaaggt tggggaggtt ggggcgatga tagggaccga 8940
catttatgtt tggattttgc cactatgtat cgttacagca gtagcagtgc atctccatga 9000
aacggatccg ctagagtccg caaaaatcac cagtctctct ctacaaatct atctctctct 9060
atttttctcc agaataatgt gtgagtagtt cccagataag ggaattaggg ttcttatagg 9120
gtttcgctca tgtgttgagc atataagaaa cccttagtat gtatttgtat ttgtaaaata 9180
cttctatcaa taaaatttct aatcctaaaa ccaaaatccc gcgagagacc tcttaattaa 9240



<210> 35
<211> 2180
<212> DNA
<213> hybrid

<400> 35
ggcgcgcctc gaggcgatcg cagatccgat ataacaaaat ttgaatcgca cagatcgatc 60
tctttggaga ttctatacct agaaaatgga gacgattttc aaatctctgt aaaaattctg 120
gtttcttctt gacggaagaa gacgacgact ccaatatttc ggttagtact gaaccggaaa 180
gtttgactgg tgcaaccaat ttaatgtacc gtacgtaacg caccaatcgg attttgtatt 240
caatgggcct tatctgtgag cccattaatt gatgtgacgg cctaaactaa atccgaacgg 300
tttatttcag cgatccgcga cggtttgtat tcagccaata gcaatcaatt atgtagcagt 360
ggtgatcctc gtcaaaccag taaagctaga tctggaccgt tgaattggtg caagaaagca 420
catgttgtga tatttttacc cgtacgatta gaaaacttga gaaacacatt gataatcgat 480
aaaaaccgtc cgatcatata aatccgcttt accatcgttg cctataaatt aatatcaata 540
gccgtacacg cgtgaagact gacaatatta tctttttcga attcggagct caagtttgaa 600
attcggagaa gctagagagt tttctgataa ccatggcgag agggagcaga tcagtgggta 660
gcagcagcag caaatggagg tactgcaacc cttcctatta cttgaagcgc ccaaagcgtc 720
ttgctctgct cttcatcgtt ttcgtttgtg tctctttcgt tttctgggac cgtcaaactc 780
tcgtcagaga gcaccaggtt gaaatttctg agctgcagaa agaagtgact gatttgaaaa 840
atttggtgga tgatttaaat aacaaacaag gtggtacctc tgggaaaact gacttgggga 900
ccatgggaca gatgcctgtg gctgctgtag tggttatggc ctgcagtcgt gcagactatc 960
ttgaaaggac tgttaaatca gttttaacat atcaaactcc cgttgcttca aaatatcctc 1020
tatttatatc tcaggatgga tctgatcaag ctgtcaagag caagtcattg agctataatc 1080
aattaacata tatgcagcac ttggattttg aaccagtggt cactgaaagg cctggcgaac 1140
tgactgcgta ctacaagatt gcacgtcact acaagtgggc actggaccag ttgttttaca 1200
aacacaaatt tagtcgagtg attatactag aagatgatat ggaaattgct ccagacttct 1260
ttgattactt tgaggctgca gctagtctca tggataggga taaaaccatt atggctgctt 1320
catcatggaa tgataatgga cagaagcagt ttgtgcatga tccctatgcg ctataccgat 1380
cagatttttt tcctggcctt gggtggatgc tcaagagatc gacttgggat gagttatcac 1440
caaagtggcc aaaggcttac tgggatgatt ggctgagact aaaggaaaac cataaaggcc 1500
gccaattcat tcgaccggaa gtctgtagaa catacaattt tggtgaacat gggtctagtt 1560
tgggacagtt tttcagtcag tatctggaac ctataaagct aaacgatgtg acggttgact 1620
ggaaagcaaa ggacctggga tacctgacag agggaaacta taccaagtac ttttctggct 1680
tagtgagaca agcacgacca attcaaggtt ctgaccttgt cttaaaggct caaaacataa 1740
aggatgatgt tcgtatccgg tataaagacc aagtagagtt tgaacgcatt gcaggggaat 1800
ttggtatatt tgaagaatgg aaggatggtg tgcctcgaac agcatataaa ggagtagtgg 1860
tgtttcgaat ccagacaaca agacgtgtat tcctggttgg gccagattct gtaatgcagc 1920
ttggaattcg aaattcctga tgcggatccg ctagagtccg caaaaatcac cagtctctct 1980
ctacaaatct atctctctct atttttctcc agaataatgt gtgagtagtt cccagataag 2040
ggaattaggg ttcttatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 2100
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aatcctaaaa ccaaaatccc 2160
gcgagagacc tcttaattaa 2180



<210> 36
<211> 615
<212> DNA
<213> Arabidopsis thaliana

<400> 36
ggatccgata taacaaaatt tgaatcgcac agatcgatct ctttggagat tctataccta 60
gaaaatggag acgattttca aatctctgta aaaattctgg tttcttcttg acggaagaag 120
acgacgactc caatatttcg gttagtactg aaccggaaag tttgactggt gcaaccaatt 180
taatgtaccg tacgtaacgc accaatcgga ttttgtattc aatgggcctt atctgtgagc 240
ccattaattg atgtgacggc ctaaactaaa tccgaacggt ttatttcagc gatccgcgac 300
ggtttgtatt cagccaatag caatcaatta tgtagcagtg gtgatcctcg tcaaaccagt 360
aaagctagat ctggaccgtt gaattggtgc aagaaagcac atgttgtgat atttttaccc 420
gtacgattag aaaacttgag aaacacattg ataatcgata aaaaccgtcc gatcatataa 480
atccgcttta ccatcgttgc ctataaatta atatcaatag ccgtacacgc gtgaagactg 540
acaatattat ctttttcgaa ttcggagctc aagtttgaaa ttcggagaag ctagagagtt 600
ttctgataac catgg 615



<210> 37
<211> 1319
<212> DNA
<213> hybrid

<400> 37
ccatggcgag agggagcaga tcagtgggta gcagcagcag caaatggagg tactgcaacc 60
cttcctatta cttgaagcgc ccaaagcgtc ttgctctgct cttcatcgtt ttcgtttgtg 120
tctctttcgt tttctgggac cgtcaaactc tcgtcagaga gcaccaggtt gaaatttctg 180
agctgcagaa agaagtgact gatttgaaaa atttggtgga tgatttaaat aacaaacaag 240
gtggtacctc tgggaaaact gacttgggga ccatgggaca gatgcctgtg gctgctgtag 300
tggttatggc ctgcagtcgt gcagactatc ttgaaaggac tgttaaatca gttttaacat 360
atcaaactcc cgttgcttca aaatatcctc tatttatatc tcaggatgga tctgatcaag 420
ctgtcaagag caagtcattg agctataatc aattaacata tatgcagcac ttggattttg 480
aaccagtggt cactgaaagg cctggcgaac tgactgcgta ctacaagatt gcacgtcact 540
acaagtgggc actggaccag ttgttttaca aacacaaatt tagtcgagtg attatactag 600
aagatgatat ggaaattgct ccagacttct ttgattactt tgaggctgca gctagtctca 660
tggataggga taaaaccatt atggctgctt catcatggaa tgataatgga cagaagcagt 720
ttgtgcatga tccctatgcg ctataccgat cagatttttt tcctggcctt gggtggatgc 780
tcaagagatc gacttgggat gagttatcac caaagtggcc aaaggcttac tgggatgatt 840
ggctgagact aaaggaaaac cataaaggcc gccaattcat tcgaccggaa gtctgtagaa 900
catacaattt tggtgaacat gggtctagtt tgggacagtt tttcagtcag tatctggaac 960
ctataaagct aaacgatgtg acggttgact ggaaagcaaa ggacctggga tacctgacag 1020
agggaaacta taccaagtac ttttctggct tagtgagaca agcacgacca attcaaggtt 1080
ctgaccttgt cttaaaggct caaaacataa aggatgatgt tcgtatccgg tataaagacc 1140
aagtagagtt tgaacgcatt gcaggggaat ttggtatatt tgaagaatgg aaggatggtg 1200
tgcctcgaac agcatataaa ggagtagtgg tgtttcgaat ccagacaaca agacgtgtat 1260
tcctggttgg gccagattct gtaatgcagc ttggaattcg aaattcctga tgcggatcc 1319



<210> 38
<211> 4957
<212> DNA
<213> hybrid

<400> 38
ggcgcgcctc gaggcgatcg cagatctaat ctaaccaatt acgatacgct ttgggtacac 60
ttgatttttg tttcagtggt tacatatatc ttgttttata tgctatcttt aaggatctgc 120
acaaagatta tttgttgatg ttcttgatgg ggctcagaag atttgatatg atacactcta 180
atctttagga gataccagcc aggattatat tcagtaagac aatcaaattt tacgtgttca 240
aactcgttat cttttcattc aaaggatgag ccagaatctt tatagaatga ttgcaatcga 300
gaatatgttc ggccgatatg cctttgttgg cttcaatatt ctacatatca cacaagaatc 360
gaccgtattg taccctcttt ccataaagga aaacacaata tgcagatgct tttttcccac 420
atgcagtaac atataggtat tcaaaaatgg ctaaaagaag ttggataaca aattgacaac 480
tatttccatt tctgttatat aaatttcaca acacacaaaa gcccgtaatc aagagtctgc 540
ccatgtacga aataacttct attatttggt attgggccta agcccagctc agagtacgtg 600
ggggtaccac atataggaag gtaacaaaat actgcaagat agccccataa cgtaccagcc 660
tctccttacc acgaagagat aagatataag acccaccctg ccacgtgtca catcgtcatg 720
gtggttaatg ataagggatt acatccttct atgtttgtgg acatgatgca tgtaatgtca 780
tgagccacag gatccaatgg ccacaggaac gtaagaatgt agatagattt gattttgtcc 840
gttagatagc aaacaacatt ataaaaggtg tgtatcaata ggaactaatt cactcattgg 900
attcatagaa gtccattcct cctaagtatc tagaaaccat ggcgagaggg agcagatcag 960
tgggtagcag cagcagcaaa tggaggtact gcaacccttc ctattacttg aagcgcccaa 1020
agcgtcttgc tctgctcttc atcgttttcg tttgtgtctc tttcgttttc tgggaccgtc 1080
aaactctcgt cagagagcac caggttgaaa tttctgagct gcagaaagaa gtgactgatt 1140
tgaaaaattt ggtggatgat ttaaataaca aacaaggtgg tacctctggg aaaactgact 1200
tggggaccat ggattccaat tcaggcgccg tcgttgatat cacaactaaa gatctatacg 1260
ataggattga gtttcttgat acagatggtg gtccatggaa acaaggttgg agagttacgt 1320
ataaagacga tgagtgggag aaagagaagc tcaaaatctt cgttgttcct cattctcata 1380
acgatcctgg ttggaaattg actgtagagg agtattatca gagacaatcc agacatattc 1440
ttgacaccat tgttgagact ttatctaagg tatgacgaaa gtttttgctt ttggttttaa 1500
tattttaatt ctctcccatg gttatcccgt gaacaatctt aaatgtctta aaattctcat 1560
gacgtcatta aactctataa ccaaacttct ttgctgggtt ctgttttttt ttagtttcgt 1620
gatgaaacag agttctagaa gttcgttctt ttggaaaatt tgaagtcttt ggagctaaag 1680
tttgtttttt tattactggg ttttgagatt gaaggatagc tagaatctta tttgtgtggg 1740
ggtttgtttt gaatatgttt aataggattc aagaagaaag tttatatggg aggagatgtc 1800
atatctggag agatggtgga gagacgcttc acctaataaa caagaagctt tgactaaatt 1860
ggttaaggat gggcagctag agattgttgg aggtggctgg gttatgaatg atgaggctaa 1920
ttcacattat tttgccataa ttgaacagat agcagagggt aatatgtggc tgaatgacac 1980
aattggggtt attcctaaga attcttgggc tatagatccc tttggctatt catcaaccat 2040
ggcttatctt ctccggcgta tgggttttga aaacatgctt attcaaagga ctcattacga 2100
gctcaagaaa gaccttgccc agcataagaa tcttgaatat atttggcgtc agagctggga 2160
tgctatggaa accacagata tctttgttca tatgatgccg ttttattcat acgatatccc 2220
acacacttgt ggaccagagc ctgcaatttg ctgtcagttt gatttcgctc ggatgcgggg 2280
atttaagtat gaactttgtc catggggaaa gcacccagtg gagaccacac tagaaaatgt 2340
gcaggagagg gcattaaagc ttctggatca atacaggaaa aaatccactc tatatcgaac 2400
taatacactt cttatacctc ttggagatga ttttaggtac attagtatcg atgaagccga 2460
ggctcagttc cgtaactacc agatgttgtt tgatcacatc aactctaatc ctagtctaaa 2520
cgcagaagca aagtttggta ctttggagga ttatttcaga acagtccgag aagaagcaga 2580
cagagtgaat tattctcgtc ctggtgaggt tggctctggt caggttgttg gtttcccttc 2640
tctgtcaggt gacttcttta catatgcaga taggcaacaa gactattgga gtggttatta 2700
tgtttcaaga cctttcttca aagctgttga tcgtgtgctc gagcataccc ttcgtggagc 2760
tgagatcatg atgtcatttc tgctaggtta ttgccatcga attcaatgtg agaaatttcc 2820
aacaagtttt acgtataagt tgactgctgc aagaagaaat ctggctcttt tccagcacca 2880
tgatggggta actggaactg ctaaggatta tgtggtacaa gattacggca cccggatgca 2940
tacttcattg caagaccttc agatctttat gtctaaagca atcgaagttc ttcttgggat 3000
ccgccacgag aaagaaaaat ctgatcaatc cccatcattt ttcgaggcag agcaaatgag 3060
atcaaagtat gatgctcggc cagttcacaa gccaattgct gcccgggaag gaaattcgca 3120
cacagttata ctcttcaatc catcagaaca gacgagagag gaggtggtga cggttgttgt 3180
taaccgcgct gaaatctcgg ttttggactc aaactggact tgtgtcccta gccaaatttc 3240
tcctgaagtg cagcatgacg ataccaaact attcaccggc agacatcgcc tttactggaa 3300
agcttccatc ccagctcttg gtctgagaac atatttcatt gctaatggga atgtcgagtg 3360
tgagaaagct actccgtcta aactcaaata cgcttctgag tttgacccat ttccttgtcc 3420
tcctccatat tcctgctcca aactggacaa cgacgttact gagatccgaa atgaacatca 3480
gactcttgtg tttgatgtga agaacggatc actgcggaag atagtccata gaaacggatc 3540
agagactgtt gtgggagaag agataggtat gtactctagt ccagagagtg gagcttacct 3600
gttcaaacca gatggtgaag ctcagccaat tgttcaacct gatggacatg tagtcacctc 3660
tgagggtctg ctggttcaag aagtcttctc ttaccctaaa accaaatggg agaaatcacc 3720
cctctctcag aaaactcgtc tttacactgg aggtaatacg cttcaggatc aagtggtcga 3780
gatagaatat catgttgagc ttcttggtaa tgattttgat gaccgggaat tgattgtccg 3840
gtacaagact gatgttgaca acaagaaggt cttctattca gatctcaatg gtttccaaat 3900
gagcaggaga gaaacttatg ataagatccc tcttcaagga aactactacc caatgccatc 3960
tctcgcattt atccaaggat ccaatggtca gagattctcc gtgcactctc gtcaatctct 4020
cggtgttgca agcctcaaag agggttggtt ggagattatg ctggacagac ggttggttcg 4080
tgatgacgga cggggtctag ggcaaggtgt gatggataac cgcgcaatga ccgtggtatt 4140
tcaccttctt gcggaatcta acatttctca agcagaccct gcttccaaca ctaacccgag 4200
gaacccttcg cttctctctc acctcatagg tgctcactta aactacccca taaacacatt 4260
cattgccaag aaaccgcaag acatatctgt gcgtgttcca caatacggtt cctttgctcc 4320
tttagccaaa ccgttaccat gtgacctcca cattgtaaat ttcaaggttc ctcgtccatc 4380
caaatactct cagcaattgg aagaagacaa gccaaggttc gctcttatcc tcaatagacg 4440
agcttgggat tcagcttatt gccataaagg aagacaagta aactgcacaa gcatggctaa 4500
tgaaccagta aacttttccg acatgttcaa agatcttgca gcttcaaagg taaaaccaac 4560
ttcactgaat ctcttgcaag aagatatgga gattcttggg tacgatgacc aagagctacc 4620
tcgagatagt tcacagccac gggaaggacg tgtctcgatc tctcccatgg aaatacgagc 4680
ttataagctt gaactgcgac ctcacaagtg aacctgctga agatccgcta gagtccgcaa 4740
aaatcaccag tctctctcta caaatctatc tctctctatt tttctccaga ataatgtgtg 4800
agtagttccc agataaggga attagggttc ttatagggtt tcgctcatgt gttgagcata 4860
taagaaaccc ttagtatgta tttgtatttg taaaatactt ctatcaataa aatttctaat 4920
cctaaaacca aaatcccgcg agagacctct taattaa 4957



<210> 39
<211> 921
<212> DNA
<213> Chrysanthemum x morifolium

<400> 39
agatctaatc taaccaatta cgatacgctt tgggtacact tgatttttgt ttcagtggtt 60
acatatatct tgttttatat gctatcttta aggatctgca caaagattat ttgttgatgt 120
tcttgatggg gctcagaaga tttgatatga tacactctaa tctttaggag ataccagcca 180
ggattatatt cagtaagaca atcaaatttt acgtgttcaa actcgttatc ttttcattca 240
aaggatgagc cagaatcttt atagaatgat tgcaatcgag aatatgttcg gccgatatgc 300
ctttgttggc ttcaatattc tacatatcac acaagaatcg accgtattgt accctctttc 360
cataaaggaa aacacaatat gcagatgctt ttttcccaca tgcagtaaca tataggtatt 420
caaaaatggc taaaagaagt tggataacaa attgacaact atttccattt ctgttatata 480
aatttcacaa cacacaaaag cccgtaatca agagtctgcc catgtacgaa ataacttcta 540
ttatttggta ttgggcctaa gcccagctca gagtacgtgg gggtaccaca tataggaagg 600
taacaaaata ctgcaagata gccccataac gtaccagcct ctccttacca cgaagagata 660
agatataaga cccaccctgc cacgtgtcac atcgtcatgg tggttaatga taagggatta 720
catccttcta tgtttgtgga catgatgcat gtaatgtcat gagccacagg atccaatggc 780
cacaggaacg taagaatgta gatagatttg attttgtccg ttagatagca aacaacatta 840
taaaaggtgt gtatcaatag gaactaattc actcattgga ttcatagaag tccattcctc 900
ctaagtatct agaaaccatg g 921



<210> 40
<211> 3789
<212> DNA
<213> hybrid

<400> 40
ccatggcgag agggagcaga tcagtgggta gcagcagcag caaatggagg tactgcaacc 60
cttcctatta cttgaagcgc ccaaagcgtc ttgctctgct cttcatcgtt ttcgtttgtg 120
tctctttcgt tttctgggac cgtcaaactc tcgtcagaga gcaccaggtt gaaatttctg 180
agctgcagaa agaagtgact gatttgaaaa atttggtgga tgatttaaat aacaaacaag 240
gtggtacctc tgggaaaact gacttgggga ccatggattc caattcaggc gccgtcgttg 300
atatcacaac taaagatcta tacgatagga ttgagtttct tgatacagat ggtggtccat 360
ggaaacaagg ttggagagtt acgtataaag acgatgagtg ggagaaagag aagctcaaaa 420
tcttcgttgt tcctcattct cataacgatc ctggttggaa attgactgta gaggagtatt 480
atcagagaca atccagacat attcttgaca ccattgttga gactttatct aaggtatgac 540
gaaagttttt gcttttggtt ttaatatttt aattctctcc catggttatc ccgtgaacaa 600
tcttaaatgt cttaaaattc tcatgacgtc attaaactct ataaccaaac ttctttgctg 660
ggttctgttt ttttttagtt tcgtgatgaa acagagttct agaagttcgt tcttttggaa 720
aatttgaagt ctttggagct aaagtttgtt tttttattac tgggttttga gattgaagga 780
tagctagaat cttatttgtg tgggggtttg ttttgaatat gtttaatagg attcaagaag 840
aaagtttata tgggaggaga tgtcatatct ggagagatgg tggagagacg cttcacctaa 900
taaacaagaa gctttgacta aattggttaa ggatgggcag ctagagattg ttggaggtgg 960
ctgggttatg aatgatgagg ctaattcaca ttattttgcc ataattgaac agatagcaga 1020
gggtaatatg tggctgaatg acacaattgg ggttattcct aagaattctt gggctataga 1080
tccctttggc tattcatcaa ccatggctta tcttctccgg cgtatgggtt ttgaaaacat 1140
gcttattcaa aggactcatt acgagctcaa gaaagacctt gcccagcata agaatcttga 1200
atatatttgg cgtcagagct gggatgctat ggaaaccaca gatatctttg ttcatatgat 1260
gccgttttat tcatacgata tcccacacac ttgtggacca gagcctgcaa tttgctgtca 1320
gtttgatttc gctcggatgc ggggatttaa gtatgaactt tgtccatggg gaaagcaccc 1380
agtggagacc acactagaaa atgtgcagga gagggcatta aagcttctgg atcaatacag 1440
gaaaaaatcc actctatatc gaactaatac acttcttata cctcttggag atgattttag 1500
gtacattagt atcgatgaag ccgaggctca gttccgtaac taccagatgt tgtttgatca 1560
catcaactct aatcctagtc taaacgcaga agcaaagttt ggtactttgg aggattattt 1620
cagaacagtc cgagaagaag cagacagagt gaattattct cgtcctggtg aggttggctc 1680
tggtcaggtt gttggtttcc cttctctgtc aggtgacttc tttacatatg cagataggca 1740
acaagactat tggagtggtt attatgtttc aagacctttc ttcaaagctg ttgatcgtgt 1800
gctcgagcat acccttcgtg gagctgagat catgatgtca tttctgctag gttattgcca 1860
tcgaattcaa tgtgagaaat ttccaacaag ttttacgtat aagttgactg ctgcaagaag 1920
aaatctggct cttttccagc accatgatgg ggtaactgga actgctaagg attatgtggt 1980
acaagattac ggcacccgga tgcatacttc attgcaagac cttcagatct ttatgtctaa 2040
agcaatcgaa gttcttcttg ggatccgcca cgagaaagaa aaatctgatc aatccccatc 2100
atttttcgag gcagagcaaa tgagatcaaa gtatgatgct cggccagttc acaagccaat 2160
tgctgcccgg gaaggaaatt cgcacacagt tatactcttc aatccatcag aacagacgag 2220
agaggaggtg gtgacggttg ttgttaaccg cgctgaaatc tcggttttgg actcaaactg 2280
gacttgtgtc cctagccaaa tttctcctga agtgcagcat gacgatacca aactattcac 2340
cggcagacat cgcctttact ggaaagcttc catcccagct cttggtctga gaacatattt 2400
cattgctaat gggaatgtcg agtgtgagaa agctactccg tctaaactca aatacgcttc 2460
tgagtttgac ccatttcctt gtcctcctcc atattcctgc tccaaactgg acaacgacgt 2520
tactgagatc cgaaatgaac atcagactct tgtgtttgat gtgaagaacg gatcactgcg 2580
gaagatagtc catagaaacg gatcagagac tgttgtggga gaagagatag gtatgtactc 2640
tagtccagag agtggagctt acctgttcaa accagatggt gaagctcagc caattgttca 2700
acctgatgga catgtagtca cctctgaggg tctgctggtt caagaagtct tctcttaccc 2760
taaaaccaaa tgggagaaat cacccctctc tcagaaaact cgtctttaca ctggaggtaa 2820
tacgcttcag gatcaagtgg tcgagataga atatcatgtt gagcttcttg gtaatgattt 2880
tgatgaccgg gaattgattg tccggtacaa gactgatgtt gacaacaaga aggtcttcta 2940
ttcagatctc aatggtttcc aaatgagcag gagagaaact tatgataaga tccctcttca 3000
aggaaactac tacccaatgc catctctcgc atttatccaa ggatccaatg gtcagagatt 3060
ctccgtgcac tctcgtcaat ctctcggtgt tgcaagcctc aaagagggtt ggttggagat 3120
tatgctggac agacggttgg ttcgtgatga cggacggggt ctagggcaag gtgtgatgga 3180
taaccgcgca atgaccgtgg tatttcacct tcttgcggaa tctaacattt ctcaagcaga 3240
ccctgcttcc aacactaacc cgaggaaccc ttcgcttctc tctcacctca taggtgctca 3300
cttaaactac cccataaaca cattcattgc caagaaaccg caagacatat ctgtgcgtgt 3360
tccacaatac ggttcctttg ctcctttagc caaaccgtta ccatgtgacc tccacattgt 3420
aaatttcaag gttcctcgtc catccaaata ctctcagcaa ttggaagaag acaagccaag 3480
gttcgctctt atcctcaata gacgagcttg ggattcagct tattgccata aaggaagaca 3540
agtaaactgc acaagcatgg ctaatgaacc agtaaacttt tccgacatgt tcaaagatct 3600
tgcagcttca aaggtaaaac caacttcact gaatctcttg caagaagata tggagattct 3660
tgggtacgat gaccaagagc tacctcgaga tagttcacag ccacgggaag gacgtgtctc 3720
gatctctccc atggaaatac gagcttataa gcttgaactg cgacctcaca agtgaacctg 3780
ctgaagatc 3789



<210> 41
<211> 2145
<212> DNA
<213> hybrid

<400> 41

ggcgcgcctc gaggcgatcg cagatctcat tataccgtta gaagcatagt taaaatctaa 60

agcttgtcgt taattctagt cattttacat tgttgggttc tacattatta atgaattttc 120

taatgcaaat acagaattta aatcaaaatt gttgaattat gctaaacatg taacatacgt 180

atatctccgc cttgtgtgtt gtattaactt gaagttatca taagaaccac aaatacacta 240

gtaaatctat gagaaggcag gtggcaacac aaacaagagt atctaagatt ttcatttgtg 300

actataggaa tataatatct cttatctgat ttaatgaatc cacatgttca cttctcattt 360

gtccacaaga tcacaacttt atcttcaata ttcacaactt gttatatcca ccacaatttc 420

attcttttca cttagcccca caaaatactt tgtcccctta tttgccacct tttgtattta 480

atttattctt gtggagctaa gtgttcatat tattcttctt ctcaaaaaaa caaaaacaaa 540

aaaaaagaga agaaaaccat ggcgagaggg agcagatcag tgggtagcag cagcagcaaa 600

tggaggtact gcaacccttc ctattacttg aagcgcccaa agcgtcttgc tctgctcttc 660

atcgttttcg tttgtgtctc tttcgttttc tgggaccgtc aaactctcgt cagagagcac 720

caggttgaaa tttctgagct gcagaaagaa gtgactgatt tgaaaaattt ggtggatgat 780

ttaaataaca aacaaggtgg tacctctggg aaaactgact tggggaccat ggctctaagg 840

ttgcatagaa ggaaccattt ttcgcctaga aatacggatc tgttcccgga tttggcaaaa 900

gatcgtgtgg ttatcgtctt gtatgtgcat aatcgggctc agtattttcg agtcacagtg 960

gaaagtttgt cgaaggttaa aggtataagt gagacattgt tgattgttag tcatgatggt 1020

tactttgaag agatgaatag gattgtggag agtattaagt tttgtcaagt gaaacagatt 1080

ttctcgcctt attcgcctca tatatatcgt actagcttcc cgggtgtgac cctgaatgat 1140

tgtaagaaca agggtgatga ggcaaagggg cattgtgaag gtaatcctga tcagtatggg 1200

aatcatcggt ctccgaagat tgtatctttg aagcatcact ggtggtggat gatgaacact 1260

gtatgggatg ggttggaaga gactaaagga catgaggggc atatcctttt cattgaagaa 1320

gatcattttc tgtttcctaa tgcctatcgt aacatacaga ctcttacgag gctgaaaccc 1380

gcaaagtgtc ctgactgttt tgctgctaat ttagcaccgt ctgatgtgaa gtcaagagga 1440

gaagggcttg aaagtttggt tgcagagaga atgggaaatg ttgggtattc ttttaataga 1500

agtgtgtggg agaatattca tcagaaggca agagagtttt gtttctttga tgattacaac 1560

tgggatataa cgatgtgggc aacggttttc ccgtcgtttg gttccccggt gtacacattg 1620

cgagggccta ggactagtgc ggtacacttt ggaaaatgtg ggttgcatca aggtagagga 1680

gatgagggtg attgcatcga taatggggtc gtaaacatag aagttaagga aacagataaa 1740

gttgtgaaca taaaagaagg atggggagtt cgggtgtata agcatcaagc gggttataaa 1800

gccggtttcg aaggttgggg aggttggggc gatgataggg accgacattt atgtttggat 1860

tttgccacta tgtatcgtta cagcagtagc agtgcatctc catgaaacgg atccgctaga 1920

gtccgcaaaa atcaccagtc tctctctaca aatctatctc tctctatttt tctccagaat 1980

aatgtgtgag tagttcccag ataagggaat tagggttctt atagggtttc gctcatgtgt 2040

tgagcatata agaaaccctt agtatgtatt tgtatttgta aaatacttct atcaataaaa 2100

tttctaatcc taaaaccaaa atcccgcgag agacctctta attaa 2145






<210> 42
<211> 541
<212> DNA
<213> Solanum tuberosum

<400> 42

agatctcatt ataccgttag aagcatagtt aaaatctaaa gcttgtcgtt aattctagtc 60

attttacatt gttgggttct acattattaa tgaattttct aatgcaaata cagaatttaa 120

atcaaaattg ttgaattatg ctaaacatgt aacatacgta tatctccgcc ttgtgtgttg 180

tattaacttg aagttatcat aagaaccaca aatacactag taaatctatg agaaggcagg 240

tggcaacaca aacaagagta tctaagattt tcatttgtga ctataggaat ataatatctc 300

ttatctgatt taatgaatcc acatgttcac ttctcatttg tccacaagat cacaacttta 360

tcttcaatat tcacaacttg ttatatccac cacaatttca ttcttttcac ttagccccac 420

aaaatacttt gtccccttat ttgccacctt ttgtatttaa tttattcttg tggagctaag 480

tgttcatatt attcttcttc tcaaaaaaac aaaaacaaaa aaaaagagaa gaaaaccatg 540

g 541



<210> 43
<211> 1358
<212> DNA
<213> hybrid

<400> 43

ccatggcgag agggagcaga tcagtgggta gcagcagcag caaatggagg tactgcaacc 60

cttcctatta cttgaagcgc ccaaagcgtc ttgctctgct cttcatcgtt ttcgtttgtg 120

tctctttcgt tttctgggac cgtcaaactc tcgtcagaga gcaccaggtt gaaatttctg 180

agctgcagaa agaagtgact gatttgaaaa atttggtgga tgatttaaat aacaaacaag 240

gtggtacctc tgggaaaact gacttgggga ccatggctct aaggttgcat agaaggaacc 300

atttttcgcc tagaaatacg gatctgttcc cggatttggc aaaagatcgt gtggttatcg 360

tcttgtatgt gcataatcgg gctcagtatt ttcgagtcac agtggaaagt ttgtcgaagg 420

ttaaaggtat aagtgagaca ttgttgattg ttagtcatga tggttacttt gaagagatga 480

ataggattgt ggagagtatt aagttttgtc aagtgaaaca gattttctcg ccttattcgc 540

ctcatatata tcgtactagc ttcccgggtg tgaccctgaa tgattgtaag aacaagggtg 600

atgaggcaaa ggggcattgt gaaggtaatc ctgatcagta tgggaatcat cggtctccga 660

agattgtatc tttgaagcat cactggtggt ggatgatgaa cactgtatgg gatgggttgg 720

aagagactaa aggacatgag gggcatatcc ttttcattga agaagatcat tttctgtttc 780

ctaatgccta tcgtaacata cagactctta cgaggctgaa acccgcaaag tgtcctgact 840

gttttgctgc taatttagca ccgtctgatg tgaagtcaag aggagaaggg cttgaaagtt 900

tggttgcaga gagaatggga aatgttgggt attcttttaa tagaagtgtg tgggagaata 960

ttcatcagaa ggcaagagag ttttgtttct ttgatgatta caactgggat ataacgatgt 1020

gggcaacggt tttcccgtcg tttggttccc cggtgtacac attgcgaggg cctaggacta 1080

gtgcggtaca ctttggaaaa tgtgggttgc atcaaggtag aggagatgag ggtgattgca 1140

tcgataatgg ggtcgtaaac atagaagtta aggaaacaga taaagttgtg aacataaaag 1200

aaggatgggg agttcgggtg tataagcatc aagcgggtta taaagccggt ttcgaaggtt 1260

ggggaggttg gggcgatgat agggaccgac atttatgttt ggattttgcc actatgtatc 1320

gttacagcag tagcagtgca tctccatgaa acggatcc 1358



<210> 44
<211> 237
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic

<400> 44

ggatccgcta gagtccgcaa aaatcaccag tctctctcta caaatctatc tctctctatt 60

tttctccaga ataatgtgtg agtagttccc agataaggga attagggttc ttatagggtt 120

tcgctcatgt gttgagcata taagaaaccc ttagtatgta tttgtatttg taaaatactt 180

ctatcaataa aatttctaat cctaaaacca aaatcccgcg agagacctct taattaa 237



<210> 45
<211> 31
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic

<400> 45

atactcgagt taacaatgag taaacggaat c 31



<210> 46
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic

<400> 46

ttctcgatcg ccgattggtt attc 24



<210> 47
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic

<400> 47

gccgccgcga tcgggcagtc ctcc 24



<210> 48
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic

<400> 48

aacggatcca cgctagctcg gtgtcccgat 30



<210> 49
<211> 3327
<212> DNA
<213> hybrid

<400> 49

atgggcatca agatggagac acattctcag gtctttgtat acatgttgct gtggttgtct 60

ggtgtcgaca tgaagcactt caaatcttcc ctcactcaca ccgtcaagag ccgagacgag 120

ccaactccgg atcaatgccc tgcattgaag gaaagcgaag cggacatcga caccgtggcg 180

atatacccaa cttttgattt tcagccgagc tggttgcgta caaaggaatt ttgggacaag 240

tccttcgagg atcggtatga aagaattcat aacgacacta cacggcctag actgaaggta 300

atcgtggttc ctcactcaca caacgacccg ggatggctga agacgtttga acagtacttc 360

gagtggaaga ccaagaacat tatcaacaac atagtgaaca aactgcacca gtaccccaac 420

atgaccttca tttggaccga gatatcgttt ctgaatgcct ggtgggaaag gtcgcaccct 480

gtcaaacaaa aggcattgaa aaaacttatc aaagaaggtc gtctcgagat cacgacgggc 540

ggctgggtga tgccggacga agcctgcacg catatctatg cgctaattga ccagtttatt 600

gaaggacatc actgggtgaa aactaatctc ggcgtcatcc cgaagacagg atggtctatt 660

gaccccttcg gccacggggc cactgtgcct tacctgctag accagagcgg ccttgaggga 720

accattatac agagaatcca ttatgcgtgg aaacagtggc tggcggagcg acagattgag 780

gagttttact ggctggcgag ttgggctact acgaagccgt ccatgatagt gcacaatcag 840

ccgtttgata tttattcaat aaaaagcacg tgtggcccgc acccttcaat ttgtctcagt 900

ttcgacttca ggaagattcc cggcgaatat tctgaataca cagctaagca cgaagacatc 960

acggaacaca acttgcacag caaggcaaag actttgatag aggagtacga ccgtatcggg 1020

tccctgactc cacacaacgt ggtgctggtg ccgctcggag acgacttcag atacgagtac 1080

agcgtcgagt ttgatgccca atacgtcaat tatatgaaaa tgtttaacta catcaatgct 1140

cacaaggaaa tcttcaacgc tgacgtacag ttcggaactc ctctcgatta ctttaacgcc 1200

atgaaagaaa gacatcaaaa tatacccagc ttaaagggag atttcttcgt ttactccgat 1260

attttcagcg aaggtaaacc agcgtactgg tcaggttact acactactag accctaccaa 1320

aaaatcctcg cccgtcagtt cgaacaccaa ctgcgatcgg cagagatttt attcaccctt 1380

gtatcgaact acatcagaca gatgggtcgc caaggagagt tcggagcttc tgagaaaaag 1440

ttagaaaaat cttacgagca gcttatctat gctcgacgga acttgggtct gtttcaacat 1500

cacgatgcga ttactggaac atcaaagtcc agtgtgatgc aagattacgg aaccaaactg 1560

ttcacaagtc tgtatcactg catccgcctg caggaggccg cgctcaccac catcatgttg 1620

cctgaccagt cgttgcactc gcagagcatt atacaaagcg aggttgagtg ggaaacttac 1680

ggaaaaccgc ccaagaagct gcaagtgtcc ttcattgaca agaagaaagt tatacttttt 1740

aatccgttgg ctgagactcg aactgaagtg gtcacggtta gatccaacac gtccaacatc 1800

cgggtgtacg atacacacaa gaggaagcac gtcttgtatc agataatgcc cagcatcaca 1860

atccaagaca acggcaagag tatcgtaagc gacaccacgt tcgacataat gttcgtggcc 1920

accatcccgc ccctcacctc catctcgtac aagctgcagg agcacaccaa cacttcccac 1980

cactgcgtca ttttctgcaa caactgcgaa caataccaga aatccaatgt gttccaaatt 2040

aagaaaatga tgcctggtga catacaatta gaaaatgcag tgctaaaact tctcgttaat 2100

aggaacaccg gctttctgag acaagtctat agaaaggaca tccggaagag aactgtcgtt 2160

gacgtacaat tcggcgcata tcaaagtgcc caaagacatt ctggtgctta cctcttcatg 2220

cctcattacg actcacctga gaagaatgtt ctgcatccct acactaatca gaacaacatg 2280

caagatgata acataatcat agtgtccgga cctatttcta cggaaatcac gaccatgtac 2340

ttgcccttct tggtgcacac tattaggata tacaacgtgc cggacccggt actgtcgcgt 2400

gctattctat tagagaccga tgtagatttc gaggcgccac ctaagaacag agagactgag 2460

ttatttatga gattacagac tgatatacaa aacggtgaca ttcccgaatt ttacaccgat 2520

cagaacggat tccagtacca aaagagggtc aaagtgaata aactaggaat agaagctaat 2580

tactacccga tcactaccat ggcgtgcctg caagacgagg agacccggct cactctgctg 2640

acgaaccacg ctcaaggcgc tgctgcatac gaaccaggac gcttagaagt catgctcgat 2700

cgtcgaactc tttatgatga cttcagagga atcggtgaag gagtagtcga taacaaaccg 2760

acgactttcc agaactggat tttaattgaa tccatgccag gcgtgacgcg agccaagaga 2820

gacactagtg aaccaggttt caaatttgtt aatgaacgtc gttttggccc cggccagaag 2880

gaaagccctt accaagtacc gtcgcagact gcggactacc tgagcaggat gttcaattac 2940

ccggtgaacg tgtacctggt ggacactagc gaggttggcg agatcgaggt gaagccgtac 3000

cagtcgttcc tgcagagctt cccgcccggc atccacctgg tcaccctgcg caccatcacc 3060

gacgacgtgc tcgaactctt ccccagcaac gaaagctaca tggtactgca ccgaccagga 3120

tacagctgcg ctgtcggaga gaagccagtc gccaagtctc ccaagttttc gtccaaaacc 3180

aggttcaatg gtctgaacat tcagaacatc actgcagtca gcctgaccgg cctgaagtca 3240

ctccgacctc tcacaggtct gagtgacatc cacctgaacg ctatggaggt aaaaacttac 3300

aagatcaggt ttaaggacga gctttaa 3327



<210> 50
<211> 1108
<212> PRT
<213> hybrid

<400> 50

Met Gly Ile Lys Met Glu Thr His Ser Gln Val Phe Val Tyr Met Leu
1 5 10 15

Leu Trp Leu Ser Gly Val Asp Met Lys His Phe Lys Ser Ser Leu Thr
20 25 30

His Thr Val Lys Ser Arg Asp Glu Pro Thr Pro Asp Gln Cys Pro Ala
35 40 45

Leu Lys Glu Ser Glu Ala Asp Ile Asp Thr Val Ala Ile Tyr Pro Thr
50 55 60

Phe Asp Phe Gln Pro Ser Trp Leu Arg Thr Lys Glu Phe Trp Asp Lys
65 70 75 80

Ser Phe Glu Asp Arg Tyr Glu Arg Ile His Asn Asp Thr Thr Arg Pro
85 90 95

Arg Leu Lys Val Ile Val Val Pro His Ser His Asn Asp Pro Gly Trp
100 105 110

Leu Lys Thr Phe Glu Gln Tyr Phe Glu Trp Lys Thr Lys Asn Ile Ile
115 120 125

Asn Asn Ile Val Asn Lys Leu His Gln Tyr Pro Asn Met Thr Phe Ile
130 135 140

Trp Thr Glu Ile Ser Phe Leu Asn Ala Trp Trp Glu Arg Ser His Pro
145 150 155 160

Val Lys Gln Lys Ala Leu Lys Lys Leu Ile Lys Glu Gly Arg Leu Glu
165 170 175

Ile Thr Thr Gly Gly Trp Val Met Pro Asp Glu Ala Cys Thr His Ile
180 185 190

Tyr Ala Leu Ile Asp Gln Phe Ile Glu Gly His His Trp Val Lys Thr
195 200 205

Asn Leu Gly Val Ile Pro Lys Thr Gly Trp Ser Ile Asp Pro Phe Gly
210 215 220

His Gly Ala Thr Val Pro Tyr Leu Leu Asp Gln Ser Gly Leu Glu Gly
225 230 235 240

Thr Ile Ile Gln Arg Ile His Tyr Ala Trp Lys Gln Trp Leu Ala Glu
245 250 255

Arg Gln Ile Glu Glu Phe Tyr Trp Leu Ala Ser Trp Ala Thr Thr Lys
260 265 270

Pro Ser Met Ile Val His Asn Gln Pro Phe Asp Ile Tyr Ser Ile Lys
275 280 285

Ser Thr Cys Gly Pro His Pro Ser Ile Cys Leu Ser Phe Asp Phe Arg
290 295 300

Lys Ile Pro Gly Glu Tyr Ser Glu Tyr Thr Ala Lys His Glu Asp Ile
305 310 315 320

Thr Glu His Asn Leu His Ser Lys Ala Lys Thr Leu Ile Glu Glu Tyr
325 330 335

Asp Arg Ile Gly Ser Leu Thr Pro His Asn Val Val Leu Val Pro Leu
340 345 350

Gly Asp Asp Phe Arg Tyr Glu Tyr Ser Val Glu Phe Asp Ala Gln Tyr
355 360 365

Val Asn Tyr Met Lys Met Phe Asn Tyr Ile Asn Ala His Lys Glu Ile
370 375 380

Phe Asn Ala Asp Val Gln Phe Gly Thr Pro Leu Asp Tyr Phe Asn Ala
385 390 395 400

Met Lys Glu Arg His Gln Asn Ile Pro Ser Leu Lys Gly Asp Phe Phe
405 410 415

Val Tyr Ser Asp Ile Phe Ser Glu Gly Lys Pro Ala Tyr Trp Ser Gly
420 425 430

Tyr Tyr Thr Thr Arg Pro Tyr Gln Lys Ile Leu Ala Arg Gln Phe Glu
435 440 445

His Gln Leu Arg Ser Ala Glu Ile Leu Phe Thr Leu Val Ser Asn Tyr
450 455 460

Ile Arg Gln Met Gly Arg Gln Gly Glu Phe Gly Ala Ser Glu Lys Lys
465 470 475 480

Leu Glu Lys Ser Tyr Glu Gln Leu Ile Tyr Ala Arg Arg Asn Leu Gly
485 490 495

Leu Phe Gln His His Asp Ala Ile Thr Gly Thr Ser Lys Ser Ser Val
500 505 510

Met Gln Asp Tyr Gly Thr Lys Leu Phe Thr Ser Leu Tyr His Cys Ile
515 520 525

Arg Leu Gln Glu Ala Ala Leu Thr Thr Ile Met Leu Pro Asp Gln Ser
530 535 540

Leu His Ser Gln Ser Ile Ile Gln Ser Glu Val Glu Trp Glu Thr Tyr
545 550 555 560

Gly Lys Pro Pro Lys Lys Leu Gln Val Ser Phe Ile Asp Lys Lys Lys
565 570 575

Val Ile Leu Phe Asn Pro Leu Ala Glu Thr Arg Thr Glu Val Val Thr
580 585 590

Val Arg Ser Asn Thr Ser Asn Ile Arg Val Tyr Asp Thr His Lys Arg
595 600 605

Lys His Val Leu Tyr Gln Ile Met Pro Ser Ile Thr Ile Gln Asp Asn
610 615 620

Gly Lys Ser Ile Val Ser Asp Thr Thr Phe Asp Ile Met Phe Val Ala
625 630 635 640

Thr Ile Pro Pro Leu Thr Ser Ile Ser Tyr Lys Leu Gln Glu His Thr
645 650 655

Asn Thr Ser His His Cys Val Ile Phe Cys Asn Asn Cys Glu Gln Tyr
660 665 670

Gln Lys Ser Asn Val Phe Gln Ile Lys Lys Met Met Pro Gly Asp Ile
675 680 685

Gln Leu Glu Asn Ala Val Leu Lys Leu Leu Val Asn Arg Asn Thr Gly
690 695 700

Phe Leu Arg Gln Val Tyr Arg Lys Asp Ile Arg Lys Arg Thr Val Val
705 710 715 720

Asp Val Gln Phe Gly Ala Tyr Gln Ser Ala Gln Arg His Ser Gly Ala
725 730 735

Tyr Leu Phe Met Pro His Tyr Asp Ser Pro Glu Lys Asn Val Leu His
740 745 750

Pro Tyr Thr Asn Gln Asn Asn Met Gln Asp Asp Asn Ile Ile Ile Val
755 760 765

Ser Gly Pro Ile Ser Thr Glu Ile Thr Thr Met Tyr Leu Pro Phe Leu
770 775 780

Val His Thr Ile Arg Ile Tyr Asn Val Pro Asp Pro Val Leu Ser Arg
785 790 795 800

Ala Ile Leu Leu Glu Thr Asp Val Asp Phe Glu Ala Pro Pro Lys Asn
805 810 815

Arg Glu Thr Glu Leu Phe Met Arg Leu Gln Thr Asp Ile Gln Asn Gly
820 825 830

Asp Ile Pro Glu Phe Tyr Thr Asp Gln Asn Gly Phe Gln Tyr Gln Lys
835 840 845

Arg Val Lys Val Asn Lys Leu Gly Ile Glu Ala Asn Tyr Tyr Pro Ile
850 855 860

Thr Thr Met Ala Cys Leu Gln Asp Glu Glu Thr Arg Leu Thr Leu Leu
865 870 875 880

Thr Asn His Ala Gln Gly Ala Ala Ala Tyr Glu Pro Gly Arg Leu Glu
885 890 895

Val Met Leu Asp Arg Arg Thr Leu Tyr Asp Asp Phe Arg Gly Ile Gly
900 905 910

Glu Gly Val Val Asp Asn Lys Pro Thr Thr Phe Gln Asn Trp Ile Leu
915 920 925

Ile Glu Ser Met Pro Gly Val Thr Arg Ala Lys Arg Asp Thr Ser Glu
930 935 940

Pro Gly Phe Lys Phe Val Asn Glu Arg Arg Phe Gly Pro Gly Gln Lys
945 950 955 960

Glu Ser Pro Tyr Gln Val Pro Ser Gln Thr Ala Asp Tyr Leu Ser Arg
965 970 975

Met Phe Asn Tyr Pro Val Asn Val Tyr Leu Val Asp Thr Ser Glu Val
980 985 990

Gly Glu Ile Glu Val Lys Pro Tyr Gln Ser Phe Leu Gln Ser Phe Pro
995 1000 1005

Pro Gly Ile His Leu Val Thr Leu Arg Thr Ile Thr Asp Asp Val
1010 1015 1020

Leu Glu Leu Phe Pro Ser Asn Glu Ser Tyr Met Val Leu His Arg
1025 1030 1035

Pro Gly Tyr Ser Cys Ala Val Gly Glu Lys Pro Val Ala Lys Ser
1040 1045 1050

Pro Lys Phe Ser Ser Lys Thr Arg Phe Asn Gly Leu Asn Ile Gln
1055 1060 1065

Asn Ile Thr Ala Val Ser Leu Thr Gly Leu Lys Ser Leu Arg Pro
1070 1075 1080

Leu Thr Gly Leu Ser Asp Ile His Leu Asn Ala Met Glu Val Lys
1085 1090 1095

Thr Tyr Lys Ile Arg Phe Lys Asp Glu Leu
1100 1105

<210> 51
<211> 1068
<212> DNA
<213> hybrid

<400> 51

atgggcatca agatggagac acattctcag gtctttgtat acatgttgct gtggttgtct 60

ggtgtcgaca tgcagtcctc cggggagctc cggaccggag gggcccggcc gccgcctcct 120

ctaggcgcct cctcccagcc gcgcccgggt ggcgactcca gcccagtcgt ggattctggc 180

cctggccccg ctagcaactt gacctcggtc ccagtgcccc acaccaccgc actgtcgctg 240

cccgcctgcc ctgaggagtc cccgctgctt gtgggcccca tgctgattga gtttaacatg 300

cctgtggacc tggagctcgt ggcaaagcag aacccaaatg tgaagatggg cggccgctat 360

gcccccaggg actgcgtctc tcctcacaag gtggccatca tcattccatt ccgcaaccgg 420

caggagcacc tcaagtactg gctatattat ttgcacccag tcctgcagcg ccagcagctg 480

gactatggca tctatgttat caaccaggcg ggagacacta tattcaatcg tgctaagctc 540

ctcaatgttg gctttcaaga agccttgaag gactatgact acacctgctt tgtgtttagt 600

gacgtggacc tcattccaat gaatgaccat aatgcgtaca ggtgtttttc acagccacgg 660

cacatttccg ttgcaatgga taagtttgga ttcagcctac cttatgttca gtattttgga 720

ggtgtctctg ctctaagtaa acaacagttt ctaaccatca atggatttcc taataattat 780

tggggctggg gaggagaaga tgatgacatt tttaacagat tagtttttag aggcatgtct 840

atatctcgcc caaatgctgt ggtcgggagg tgtcgcatga tccgccactc aagagacaag 900

aaaaatgaac ccaatcctca gaggtttgac cgaattgcac acacaaagga gacaatgctc 960

tctgatggtt tgaactcact cacctaccag gtgctggatg tacagagata cccattgtat 1020

acccaaatca cagtggacat cgggacaccg agcaaggacg agctttag 1068

<210> 52
<211> 355
<212> PRT
<213> hybrid

<400> 52

Met Gly Ile Lys Met Glu Thr His Ser Gln Val Phe Val Tyr Met Leu
1 5 10 15

Leu Trp Leu Ser Gly Val Asp Met Gln Ser Ser Gly Glu Leu Arg Thr
20 25 30

Gly Gly Ala Arg Pro Pro Pro Pro Leu Gly Ala Ser Ser Gln Pro Arg
35 40 45

Pro Gly Gly Asp Ser Ser Pro Val Val Asp Ser Gly Pro Gly Pro Ala
50 55 60

Ser Asn Leu Thr Ser Val Pro Val Pro His Thr Thr Ala Leu Ser Leu
65 70 75 80

Pro Ala Cys Pro Glu Glu Ser Pro Leu Leu Val Gly Pro Met Leu Ile
85 90 95

Glu Phe Asn Met Pro Val Asp Leu Glu Leu Val Ala Lys Gln Asn Pro
100 105 110

Asn Val Lys Met Gly Gly Arg Tyr Ala Pro Arg Asp Cys Val Ser Pro
115 120 125

His Lys Val Ala Ile Ile Ile Pro Phe Arg Asn Arg Gln Glu His Leu
130 135 140

Lys Tyr Trp Leu Tyr Tyr Leu His Pro Val Leu Gln Arg Gln Gln Leu
145 150 155 160

Asp Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly Asp Thr Ile Phe Asn
165 170 175

Arg Ala Lys Leu Leu Asn Val Gly Phe Gln Glu Ala Leu Lys Asp Tyr
180 185 190

Asp Tyr Thr Cys Phe Val Phe Ser Asp Val Asp Leu Ile Pro Met Asn
195 200 205

Asp His Asn Ala Tyr Arg Cys Phe Ser Gln Pro Arg His Ile Ser Val
210 215 220

Ala Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr Val Gln Tyr Phe Gly
225 230 235 240

Gly Val Ser Ala Leu Ser Lys Gln Gln Phe Leu Thr Ile Asn Gly Phe
245 250 255

Pro Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp Asp Asp Ile Phe Asn
260 265 270

Arg Leu Val Phe Arg Gly Met Ser Ile Ser Arg Pro Asn Ala Val Val
275 280 285

Gly Arg Cys Arg Met Ile Arg His Ser Arg Asp Lys Lys Asn Glu Pro
290 295 300

Asn Pro Gln Arg Phe Asp Arg Ile Ala His Thr Lys Glu Thr Met Leu
305 310 315 320

Ser Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val Leu Asp Val Gln Arg
325 330 335

Tyr Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile Gly Thr Pro Ser Lys
340 345 350

Asp Glu Leu
355



<210> 53
<211> 1119
<212> DNA
<213> hybrid

<400> 53

atgggcatca agatggagac acattctcag gtctttgtat acatgttgct gtggttgtct 60

ggtgtcgaca tgggacagat gcctgtggct gctgtagtgg ttatggcctg cagtcgtgca 120

gactatcttg aaaggactgt taaatcagtt ttaacatatc aaactcccgt tgcttcaaaa 180

tatcctctat ttatatctca ggatggatct gatcaagctg tcaagagcaa gtcattgagc 240

tataatcaat taacatatat gcagcacttg gattttgaac cagtggtcac tgaaaggcct 300

ggcgaactga ctgcgtacta caagattgca cgtcactaca agtgggcact ggaccagttg 360

ttttacaaac acaaatttag tcgagtgatt atactagaag atgatatgga aattgctcca 420

gacttctttg attactttga ggctgcagct agtctcatgg atagggataa aaccattatg 480

gctgcttcat catggaatga taatggacag aagcagtttg tgcatgatcc ctatgcgcta 540

taccgatcag atttttttcc tggccttggg tggatgctca agagatcgac ttgggatgag 600

ttatcaccaa agtggccaaa ggcttactgg gatgattggc tgagactaaa ggaaaaccat 660

aaaggccgcc aattcattcg accggaagtc tgtagaacat acaattttgg tgaacatggg 720

tctagtttgg gacagttttt cagtcagtat ctggaaccta taaagctaaa cgatgtgacg 780

gttgactgga aagcaaagga cctgggatac ctgacagagg gaaactatac caagtacttt 840

tctggcttag tgagacaagc acgaccaatt caaggttctg accttgtctt aaaggctcaa 900

aacataaagg atgatgttcg tatccggtat aaagaccaag tagagtttga acgcattgca 960

ggggaatttg gtatatttga agaatggaag gatggtgtgc ctcgaacagc atataaagga 1020

gtagtggtgt ttcgaatcca gacaacaaga cgtgtattcc tggttgggcc agattctgta 1080

atgcagcttg gaattcgaaa ttccaaggac gagctttga 1119



<210> 54
<211> 372
<212> PRT
<213> hybrid

<400> 54

Met Gly Ile Lys Met Glu Thr His Ser Gln Val Phe Val Tyr Met Leu
1 5 10 15

Leu Trp Leu Ser Gly Val Asp Met Gly Gln Met Pro Val Ala Ala Val
20 25 30

Val Val Met Ala Cys Ser Arg Ala Asp Tyr Leu Glu Arg Thr Val Lys
35 40 45

Ser Val Leu Thr Tyr Gln Thr Pro Val Ala Ser Lys Tyr Pro Leu Phe
50 55 60

Ile Ser Gln Asp Gly Ser Asp Gln Ala Val Lys Ser Lys Ser Leu Ser
65 70 75 80

Tyr Asn Gln Leu Thr Tyr Met Gln His Leu Asp Phe Glu Pro Val Val
85 90 95

Thr Glu Arg Pro Gly Glu Leu Thr Ala Tyr Tyr Lys Ile Ala Arg His
100 105 110

Tyr Lys Trp Ala Leu Asp Gln Leu Phe Tyr Lys His Lys Phe Ser Arg
115 120 125

Val Ile Ile Leu Glu Asp Asp Met Glu Ile Ala Pro Asp Phe Phe Asp
130 135 140

Tyr Phe Glu Ala Ala Ala Ser Leu Met Asp Arg Asp Lys Thr Ile Met
145 150 155 160

Ala Ala Ser Ser Trp Asn Asp Asn Gly Gln Lys Gln Phe Val His Asp
165 170 175

Pro Tyr Ala Leu Tyr Arg Ser Asp Phe Phe Pro Gly Leu Gly Trp Met
180 185 190

Leu Lys Arg Ser Thr Trp Asp Glu Leu Ser Pro Lys Trp Pro Lys Ala
195 200 205

Tyr Trp Asp Asp Trp Leu Arg Leu Lys Glu Asn His Lys Gly Arg Gln
210 215 220

Phe Ile Arg Pro Glu Val Cys Arg Thr Tyr Asn Phe Gly Glu His Gly
225 230 235 240

Ser Ser Leu Gly Gln Phe Phe Ser Gln Tyr Leu Glu Pro Ile Lys Leu
245 250 255

Asn Asp Val Thr Val Asp Trp Lys Ala Lys Asp Leu Gly Tyr Leu Thr
260 265 270

Glu Gly Asn Tyr Thr Lys Tyr Phe Ser Gly Leu Val Arg Gln Ala Arg
275 280 285

Pro Ile Gln Gly Ser Asp Leu Val Leu Lys Ala Gln Asn Ile Lys Asp
290 295 300

Asp Val Arg Ile Arg Tyr Lys Asp Gln Val Glu Phe Glu Arg Ile Ala
305 310 315 320

Gly Glu Phe Gly Ile Phe Glu Glu Trp Lys Asp Gly Val Pro Arg Thr
325 330 335

Ala Tyr Lys Gly Val Val Val Phe Arg Ile Gln Thr Thr Arg Arg Val
340 345 350

Phe Leu Val Gly Pro Asp Ser Val Met Gln Leu Gly Ile Arg Asn Ser
355 360 365

Lys Asp Glu Leu
370

<210> 55
<211> 1158
<212> DNA
<213> hybrid

<400> 55

atgggcatca agatggagac acattctcag gtctttgtat acatgttgct gtggttgtct 60

ggtgtcgaca tggctctaag gttgcataga aggaaccatt tttcgcctag aaatacggat 120

ctgttcccgg atttggcaaa agatcgtgtg gttatcgtct tgtatgtgca taatcgggct 180

cagtattttc gagtcacagt ggaaagtttg tcgaaggtta aaggtataag tgagacattg 240

ttgattgtta gtcatgatgg ttactttgaa gagatgaata ggattgtgga gagtattaag 300

ttttgtcaag tgaaacagat tttctcgcct tattcgcctc atatatatcg tactagcttc 360

ccgggtgtga ccctgaatga ttgtaagaac aagggtgatg aggcaaaggg gcattgtgaa 420

ggtaatcctg atcagtatgg gaatcatcgg tctccgaaga ttgtatcttt gaagcatcac 480

tggtggtgga tgatgaacac tgtatgggat gggttggaag agactaaagg acatgagggg 540

catatccttt tcattgaaga agatcatttt ctgtttccta atgcctatcg taacatacag 600

actcttacga ggctgaaacc cgcaaagtgt cctgactgtt ttgctgctaa tttagcaccg 660

tctgatgtga agtcaagagg agaagggctt gaaagtttgg ttgcagagag aatgggaaat 720

gttgggtatt cttttaatag aagtgtgtgg gagaatattc atcagaaggc aagagagttt 780

tgtttctttg atgattacaa ctgggatata acgatgtggg caacggtttt cccgtcgttt 840

ggttccccgg tgtacacatt gcgagggcct aggactagtg cggtacactt tggaaaatgt 900

gggttgcatc aaggtagagg agatgagggt gattgcatcg ataatggggt cgtaaacata 960

gaagttaagg aaacagataa agttgtgaac ataaaagaag gatggggagt tcgggtgtat 1020

aagcatcaag cgggttataa agccggtttc gaaggttggg gaggttgggg cgatgatagg 1080

gaccgacatt tatgtttgga ttttgccact atgtatcgtt acagcagtag cagtgcatct 1140

ccaaaggacg agctttga 1158



<210> 56
<211> 385
<212> PRT
<213> hybrid

<400> 56

Met Gly Ile Lys Met Glu Thr His Ser Gln Val Phe Val Tyr Met Leu
1 5 10 15

Leu Trp Leu Ser Gly Val Asp Met Ala Leu Arg Leu His Arg Arg Asn
20 25 30

His Phe Ser Pro Arg Asn Thr Asp Leu Phe Pro Asp Leu Ala Lys Asp
35 40 45

Arg Val Val Ile Val Leu Tyr Val His Asn Arg Ala Gln Tyr Phe Arg
50 55 60

Val Thr Val Glu Ser Leu Ser Lys Val Lys Gly Ile Ser Glu Thr Leu
65 70 75 80

Leu Ile Val Ser His Asp Gly Tyr Phe Glu Glu Met Asn Arg Ile Val
85 90 95

Glu Ser Ile Lys Phe Cys Gln Val Lys Gln Ile Phe Ser Pro Tyr Ser
100 105 110

Pro His Ile Tyr Arg Thr Ser Phe Pro Gly Val Thr Leu Asn Asp Cys
115 120 125

Lys Asn Lys Gly Asp Glu Ala Lys Gly His Cys Glu Gly Asn Pro Asp
130 135 140

Gln Tyr Gly Asn His Arg Ser Pro Lys Ile Val Ser Leu Lys His His
145 150 155 160

Trp Trp Trp Met Met Asn Thr Val Trp Asp Gly Leu Glu Glu Thr Lys
165 170 175

Gly His Glu Gly His Ile Leu Phe Ile Glu Glu Asp His Phe Leu Phe
180 185 190

Pro Asn Ala Tyr Arg Asn Ile Gln Thr Leu Thr Arg Leu Lys Pro Ala
195 200 205

Lys Cys Pro Asp Cys Phe Ala Ala Asn Leu Ala Pro Ser Asp Val Lys
210 215 220

Ser Arg Gly Glu Gly Leu Glu Ser Leu Val Ala Glu Arg Met Gly Asn
225 230 235 240

Val Gly Tyr Ser Phe Asn Arg Ser Val Trp Glu Asn Ile His Gln Lys
245 250 255

Ala Arg Glu Phe Cys Phe Phe Asp Asp Tyr Asn Trp Asp Ile Thr Met
260 265 270

Trp Ala Thr Val Phe Pro Ser Phe Gly Ser Pro Val Tyr Thr Leu Arg
275 280 285

Gly Pro Arg Thr Ser Ala Val His Phe Gly Lys Cys Gly Leu His Gln
290 295 300

Gly Arg Gly Asp Glu Gly Asp Cys Ile Asp Asn Gly Val Val Asn Ile
305 310 315 320

Glu Val Lys Glu Thr Asp Lys Val Val Asn Ile Lys Glu Gly Trp Gly
325 330 335

Val Arg Val Tyr Lys His Gln Ala Gly Tyr Lys Ala Gly Phe Glu Gly
340 345 350

Trp Gly Gly Trp Gly Asp Asp Arg Asp Arg His Leu Cys Leu Asp Phe
355 360 365

Ala Thr Met Tyr Arg Tyr Ser Ser Ser Ser Ala Ser Pro Lys Asp Glu
370 375 380

Leu
385



<210> 57
<211> 1152
<212> DNA
<213> Homo sapiens

<400> 57

atgctgaaga agcagtctgc agggcttgtg ctgtggggcg ctatcctctt tgtggcctgg 60

aatgccctgc tgctcctctt cttctggacg cgcccagcac ctggcaggcc accctcagtc 120

agcgctctcg atggcgaccc cgccagcctc acccgggaag tcgacatgca gtcctccggg 180

gagctccgga ccggaggggc ccggccgccg cctcctctag gcgcctcctc ccagccgcgc 240

ccgggtggcg actccagccc agtcgtggat tctggccctg gccccgctag caacttgacc 300

tcggtcccag tgccccacac caccgcactg tcgctgcccg cctgccctga ggagtccccg 360

ctgcttgtgg gccccatgct gattgagttt aacatgcctg tggacctgga gctcgtggca 420

aagcagaacc caaatgtgaa gatgggcggc cgctatgccc ccagggactg cgtctctcct 480

cacaaggtgg ccatcatcat tccattccgc aaccggcagg agcacctcaa gtactggcta 540

tattatttgc acccagtcct gcagcgccag cagctggact atggcatcta tgttatcaac 600

caggcgggag acactatatt caatcgtgct aagctcctca atgttggctt tcaagaagcc 660

ttgaaggact atgactacac ctgctttgtg tttagtgacg tggacctcat tccaatgaat 720

gaccataatg cgtacaggtg tttttcacag ccacggcaca tttccgttgc aatggataag 780

tttggattca gcctacctta tgttcagtat tttggaggtg tctctgctct aagtaaacaa 840

cagtttctaa ccatcaatgg atttcctaat aattattggg gctggggagg agaagatgat 900

gacattttta acagattagt ttttagaggc atgtctatat ctcgcccaaa tgctgtggtc 960

gggaggtgtc gcatgatccg ccactcaaga gacaagaaaa atgaacccaa tcctcagagg 1020

tttgaccgaa ttgcacacac aaaggagaca atgctctctg atggtttgaa ctcactcacc 1080

taccaggtgc tggatgtaca gagataccca ttgtataccc aaatcacagt ggacatcggg 1140

acaccgagct ag 1152



<210> 58
<211> 383
<212> PRT
<213> Homo sapiens

<400> 58

Met Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly Ala Ile Leu
1 5 10 15

Phe Val Ala Trp Asn Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg Pro
20 25 30

Ala Pro Gly Arg Pro Pro Ser Val Ser Ala Leu Asp Gly Asp Pro Ala
35 40 45

Ser Leu Thr Arg Glu Val Asp Met Gln Ser Ser Gly Glu Leu Arg Thr
50 55 60

Gly Gly Ala Arg Pro Pro Pro Pro Leu Gly Ala Ser Ser Gln Pro Arg
65 70 75 80

Pro Gly Gly Asp Ser Ser Pro Val Val Asp Ser Gly Pro Gly Pro Ala
85 90 95

Ser Asn Leu Thr Ser Val Pro Val Pro His Thr Thr Ala Leu Ser Leu
100 105 110

Pro Ala Cys Pro Glu Glu Ser Pro Leu Leu Val Gly Pro Met Leu Ile
115 120 125

Glu Phe Asn Met Pro Val Asp Leu Glu Leu Val Ala Lys Gln Asn Pro
130 135 140

Asn Val Lys Met Gly Gly Arg Tyr Ala Pro Arg Asp Cys Val Ser Pro
145 150 155 160

His Lys Val Ala Ile Ile Ile Pro Phe Arg Asn Arg Gln Glu His Leu
165 170 175

Lys Tyr Trp Leu Tyr Tyr Leu His Pro Val Leu Gln Arg Gln Gln Leu
180 185 190

Asp Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly Asp Thr Ile Phe Asn
195 200 205

Arg Ala Lys Leu Leu Asn Val Gly Phe Gln Glu Ala Leu Lys Asp Tyr
210 215 220

Asp Tyr Thr Cys Phe Val Phe Ser Asp Val Asp Leu Ile Pro Met Asn
225 230 235 240

Asp His Asn Ala Tyr Arg Cys Phe Ser Gln Pro Arg His Ile Ser Val
245 250 255

Ala Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr Val Gln Tyr Phe Gly
260 265 270

Gly Val Ser Ala Leu Ser Lys Gln Gln Phe Leu Thr Ile Asn Gly Phe
275 280 285

Pro Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp Asp Asp Ile Phe Asn
290 295 300

Arg Leu Val Phe Arg Gly Met Ser Ile Ser Arg Pro Asn Ala Val Val
305 310 315 320

Gly Arg Cys Arg Met Ile Arg His Ser Arg Asp Lys Lys Asn Glu Pro
325 330 335

Asn Pro Gln Arg Phe Asp Arg Ile Ala His Thr Lys Glu Thr Met Leu
340 345 350

Ser Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val Leu Asp Val Gln Arg
355 360 365

Tyr Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile Gly Thr Pro Ser
370 375 380



<210> 59
<211> 400
<212> PRT
<213> Homo sapiens

<400> 59

Met Arg Leu Arg Glu Pro Leu Leu Ser Gly Ala Ala Met Pro Gly Ala
1 5 10 15

Ser Leu Gln Arg Ala Cys Arg Leu Leu Val Ala Val Cys Ala Leu His
20 25 30

Leu Gly Val Thr Leu Val Tyr Tyr Leu Ala Gly Arg Asp Leu Ser Arg
35 40 45

Leu Pro Gln Leu Val Gly Val Ser Thr Pro Leu Gln Gly Gly Ser Asn
50 55 60

Ser Ala Ala Ala Ile Gly Gln Ser Ser Gly Glu Leu Arg Thr Gly Gly
65 70 75 80

Ala Arg Pro Pro Pro Pro Leu Gly Ala Ser Ser Gln Pro Arg Pro Gly
85 90 95

Gly Asp Ser Ser Pro Val Val Asp Ser Gly Pro Gly Pro Ala Ser Asn
100 105 110

Leu Thr Ser Val Pro Val Pro His Thr Thr Ala Leu Ser Leu Pro Ala
115 120 125

Cys Pro Glu Glu Ser Pro Leu Leu Val Gly Pro Met Leu Ile Glu Phe
130 135 140

Asn Met Pro Val Asp Leu Glu Leu Val Ala Lys Gln Asn Pro Asn Val
145 150 155 160

Lys Met Gly Gly Arg Tyr Ala Pro Arg Asp Cys Val Ser Pro His Lys
165 170 175

Val Ala Ile Ile Ile Pro Phe Arg Asn Arg Gln Glu His Leu Lys Tyr
180 185 190

Trp Leu Tyr Tyr Leu His Pro Val Leu Gln Arg Gln Gln Leu Asp Tyr
195 200 205

Gly Ile Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly Asp Thr Ile Phe
210 215 220

Asn Arg Ala Lys Leu Leu Asn Val Gly Phe Gln Glu Ala Leu Lys Asp
225 230 235 240

Tyr Asp Tyr Thr Cys Phe Val Phe Ser Asp Val Asp Leu Ile Pro Met
245 250 255

Asn Asp His Asn Ala Tyr Arg Cys Phe Ser Gln Pro Arg His Ile Ser
260 265 270

Val Ala Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr Val Gln Tyr Phe
275 280 285

Gly Gly Val Ser Ala Leu Ser Lys Gln Gln Phe Leu Thr Ile Asn Gly
290 295 300

Phe Pro Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp Asp Asp Ile Phe
305 310 315 320

Asn Arg Leu Val Phe Arg Gly Met Ser Ile Ser Arg Pro Asn Ala Val
325 330 335

Val Gly Arg Cys Arg Met Ile Arg His Ser Arg Asp Lys Lys Asn Glu
340 345 350

Pro Asn Pro Gln Arg Phe Asp Arg Ile Ala His Thr Lys Glu Thr Met
355 360 365

Leu Ser Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val Leu Asp Val Gln
370 375 380

Arg Tyr Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile Gly Thr Pro Ser
385 390 395 400



